# Improving Viability of Electric Taxis by Taxi Service Strategy Optimization: A Big Data Study of New York City

Chien-Ming Tseng, Sid Chi-Kin Chau and Xue Liu

S. C.-K. Chau is with the Australian National University. X. Liu is with McGill University. (Email: chi-kin.chau@cl.cam.ac.uk, xueliu@cs.mcgill.ca). This paper appears in IEEE Transactions on Intelligent Transportation Systems (DOI:10.1109/TITS.2018.2839265).

###### Abstract

Electrification of transportation is critical for a low-carbon society. In particular, public vehicles (e.g., taxis) provide a crucial opportunity for electrification. Despite the benefits of eco-friendliness and energy efficiency, the adoption of electric taxis faces several obstacles, including constrained driving range, long recharging duration, limited charging stations and low gas prices, all of which discourage taxi drivers from switching to electric taxis. On the other hand, the popularity of ride-hailing mobile apps facilitates the computerization and optimization of taxi service strategies, which can provide taxi drivers with computer-assisted decisions on navigation and roaming to locate potential customers. This paper examines the viability of electric taxis with the assistance of taxi service strategy optimization, in comparison with conventional taxis with internal combustion engines. A big data study is provided using a large dataset of real-world taxi trips in New York City (NYC). Our methodology is to first model the computerized taxi service strategy as a Markov Decision Process (MDP), and then obtain the optimized taxi service strategy based on the NYC taxi trip dataset. The profitability of electric taxi drivers is studied empirically under various battery capacities and charging conditions. Consequently, we shed light on solutions that can improve the viability of electric taxis.

Electric vehicles, big data study, taxi service strategy optimization

## I Introduction

Taxis are an important part of the public transportation system, offering both the flexibility of private vehicles and the shareability of public transportation. In many cities around the world, a large number of taxis serve the ad hoc demands of commuters. Notably, taxis consume a large amount of fuel. For example, there are over 13,000 taxis operating in New York City, which together travel over 1.46 billion kilometers each year (according to the New York City taxi trip dataset of 2013 [1]) and consume over 86 million liters of gasoline. As a result, they emit over 242,900 metric tons of CO$_2$ per year (estimated by assuming that 67% of New York yellow taxis are hybrid vehicles and 33% are ICE vehicles, as of 2016), which is equivalent to the average annual CO$_2$ emissions of around 25,650 US households (the average annual CO$_2$ emission of a US household is 9.5 metric tons [2]). A viable path toward a low-carbon sustainable society is to promote electrification of transportation, replacing internal combustion engine (ICE) vehicles by more environment-friendly and energy-efficient electric vehicles (EVs). Electrification of private vehicles faces many obstacles, such as cost-effectiveness, availability of home charging infrastructure and users' perception. However, electrification of public vehicles (e.g., buses, taxis) is subject to fewer concerns, with an even greater potential impact than that of private vehicles. First, public vehicles are used more frequently, so their electrification can effectively reduce greenhouse gas emissions. Second, public vehicles are likely to park in common facilities, which facilitates the installation of charging stations. Third, public vehicles generally have shorter life cycles due to frequent usage, and hence are more readily replaced.

Major cities worldwide are introducing plans to phase out conventional ICE public vehicles in favor of electric vehicles. For example, the Chinese government has initiated several programs to promote electrification of public vehicles for air pollution mitigation [3]. Electric taxi programs were launched in Shenzhen (in 2010) and Beijing (in 2014) to convert taxis to electric vehicles, along with the installation of sufficient EV parking lots and fast charging points. In these programs, the government also offers subsidies to taxi operators. The Singapore government plans to roll out a total of 1,000 electric cars, supported by 2,000 charging points across the city, by 2020.

Nonetheless, unlike buses, taxis are often operated as private businesses. Adoption of electric taxis critically depends on the willingness of taxi drivers to switch to electric taxis from conventional ICE taxis. However, it is not clear whether taxi drivers are willing to do so. Despite the initiatives from the governments, there are notable shortcomings of electric taxis:

1. Constrained Driving Range: One of the barriers preventing wide adoption of EVs is a shorter driving range. With increasing battery capacity, the driving range has been extended to more than 200 kilometers in production EVs such as the Chevrolet Bolt. Generally, the driving ranges of production EVs are sufficient for daily commutes for personal purposes. However, a longer driving range (e.g., more than 300 kilometers) is normally required by logistics vehicles and taxis. The driving range of a high-end Tesla (as of 2017) may suffice to meet the required driving distance, but such vehicles are too costly for practical taxis.

2. Long Recharging Duration: Recharging the battery of an EV can take considerable time. For example, charging a Nissan Leaf with 30 kWh battery capacity can take up to 4 hours using mode 3 charging, or half an hour using fast DC charging (without considering queuing delay). Taxis traveling long distances are likely to take more than an hour for recharging between shifts, which is significantly longer than the time ICE taxis need to refill gasoline.

3. Limited Charging Stations: Today, charging stations are few. Also, some charging stations are reserved for specific models or brands with proprietary connectors. The expansion of charging stations is hampered by electrical infrastructure in certain regions. As a result, electric taxi drivers always need sufficient reserve battery capacity to be able to return to certain known charging stations in case of emergency.

4. Low Gas Price: Nowadays, the oil price has come down considerably from historic heights. This reduces the incentive to adopt EVs, as gasoline is relatively affordable, despite cheaper and cleaner electricity sources. Unless a carbon tax is introduced to mitigate greenhouse gas emissions, gasoline ICE vehicles are still perceived as cost-effective by the general public.

These shortcomings are likely to dissuade taxi drivers from adopting electric taxis. In particular, it is not easy to operate a taxi under the constraints of shorter driving range and limited charging stations, in comparison with conventional taxis. In fact, it has been reported in the media that taxi drivers tend to shun electric taxis. Without taxi drivers' participation, it is futile to promote electric taxis. Therefore, it is important to provide a viability analysis of electric taxis. Such an analysis can also be used as a basis to determine proper governmental subsidies for electric taxis to promote their adoption.

In this paper, we identify that a key problem of adopting electric taxis is the ineffective service strategies practiced by today's taxi drivers. In fact, we show that properly optimized taxi service strategies will not suffer from the shortcomings of electric vehicles. Therefore, there is a need to provide an intelligent recommender system that assists taxi drivers in improving their taxi service strategies, and hence increases their willingness to switch to electric taxis. In particular, the popular trend of ride-hailing mobile apps facilitates the computerization and optimization of taxi service strategies, and provides an opportunity to integrate computer-assisted optimized decisions of roaming and navigation for taxi drivers.

### I-A Modeling Taxi Service Strategy by MDP

The net revenue of a taxi driver (i.e., the revenue from taxi fares minus energy costs) is determined by his/her service strategy of passenger searching and efficiency of passenger delivery. For example, skilful taxi drivers can identify the popular spots for potential passengers, and deliver passengers efficiently by choosing faster routes. Note that the service strategies of taxi drivers can be effectively optimized by utilizing a large historical taxi trip dataset for demand prediction.

To optimize taxi service strategies for electric (or ICE) taxis, we first model the computerized taxi service strategy as a Markov Decision Process (MDP). MDP is a general framework for optimizing sequential decision processes in the presence of uncertainty. In summary, we denote a Markov state as the time and location (and possibly battery state) of a taxi, and an action as the driver's decision to travel to the next location (and possibly to recharge). At each location, there is a probabilistic transition to another location. The transition is determined by a random event of passenger pick-up. The uncertainty in the taxi service strategy lies in the pick-up location and destination of a passenger, which can be estimated from a historical taxi trip dataset.

This MDP model facilitates the optimization of computerized taxi service strategies by providing computer-assisted decisions to taxi drivers. Since human taxi service strategies are inherently inefficient, optimizing computerized taxi service strategies can potentially improve the net revenues of taxi drivers, particularly in the presence of constraints on driving range and charging stations. Computerized taxi service strategies are becoming more feasible because of the increasing adoption of ride-hailing mobile apps, which facilitates the integration of computerized taxi service strategies into a recommender system for taxi drivers using data analytics on historical taxi trip datasets. In this paper, we obtain the optimal policy of the MDP that maximizes the revenue of a taxi driver based on the New York City taxi trip dataset, and study the profitability of electric taxi drivers under various conditions of battery capacity and charging modes.

### I-B Summary

Our contributions in this paper are summarized as follows:

1. We formulate an MDP to model computerized electric taxi service strategies, with explicit consideration of constraints of EVs, such as battery capacity and locations of charging stations.

2. We obtain the optimal policy of the MDP based on a big data study using a large dataset of real-world taxi trips in New York City.

3. We study the impact of factors such as battery capacity, charging modes and locations of charging stations on the net revenues of electric taxi drivers.

4. We project our study to understand the benefits of a wider adoption of electric taxis (up to 1000 taxis).

## II Background

### II-A Related Work

The analysis of taxi trip datasets has been considered in several research papers on knowledge discovery and cloud-based intelligent transportation systems [4]. One of the popular topics is the improvement of profit/revenue for taxi drivers by developing a recommender system that helps drivers find passengers more efficiently. The basic idea is to identify good taxi service strategies. Several characteristics of taxi service strategies are reported in [5]. That study shows that searching for passengers near the drop-off location of previous passengers results in a higher revenue. It also finds that better taxi drivers can deliver passengers efficiently by choosing an uncongested route. Furthermore, GPS mobility traces from taxis can be used to predict future traffic conditions and optimize route selection [6]. Also, community detection has been applied to mobility traces to reveal potentially similar travel patterns of passengers, for social recommendation [7] and for improving transportation services [8].

Other studies focus on specific methods for improving the profit/revenue of taxi drivers. One approach in [9] shows that experienced taxi drivers usually wait for passengers at specific locations, and that they are usually aware of particular events such as train arrivals or ending times of movies.

Instead of recommending separate pick-up locations, a better approach is to maximize the revenue by selecting a route of a sequence of likely pick-up locations at different times. The top-k profitable driving routes can be computed based on a route network with revenues and pick-up probabilities from historical taxi trip data in [10]. To select an optimal route with appropriate actions, Markov Decision Process (MDP) is used to maximize the associated revenue in [11]. The optimal policy of MDP is determined to improve the taxi driver’s service strategy. The method of MDP is significantly extended in this paper to consider the constraints of EVs, such as battery capacity and locations of charging stations. Our preliminary study [12] uses a simplified model, whereas this paper presents a more realistic model and a more extensive analysis.

For EVs, limited driving range is a barrier preventing wide adoption. Therefore, the estimation of driving range for EVs has been studied in a number of research papers. The driving range of EVs is highly affected by driving speed and motor efficiency. A black-box model is widely used in the literature to predict the energy consumption of EVs and plug-in hybrid EVs (PHEVs) [13, 14]. Such a black-box model is used in this paper to estimate the energy consumption of electric taxis.

There are other studies that investigated the viability of deploying electric taxis. For example, the return on investment (ROI) for taxi companies transitioning to EVs was studied in [15], which considers the mobility trace of yellow cabs in San Francisco. The prior studies usually assumed that electric taxi drivers will adopt the same service strategies as driving a conventional ICE taxi. On the contrary, our study allows distinctive optimized service strategies for electric taxi drivers, taking into account that EVs have different operating constraints than conventional ICE vehicles.

### II-B New York City Taxi Trip Dataset

We describe the 2013 taxi trip dataset of New York City (NYC) that is used in our study. Each data record (i.e., a trip) is composed of the following attributes:

• Taxi ID (also known as medallion ID)

• Trip distance and duration

• Times of pick-ups and drop-offs of passengers

• GPS locations of pick-ups and drop-offs of passengers

We summarize the information of taxi trip dataset in Table I.

The numbers of taxi trips in the NYC dataset on different days of 2013 are depicted in Fig. 1(a). There are about 450K trips per day, and the average trip distance is around 4.2 km. Fig. 1(b) displays the pick-up locations on January 16 at 8-9 AM. The k-means algorithm is employed to group the pick-up locations into 200 clusters. The sizes of the circles indicate the numbers of pick-ups. We observe that most pick-ups occur in Midtown Manhattan. Finally, Fig. 1(c) displays the locations of charging stations in NYC [16] that can potentially recharge electric taxis.
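The clustering step can be illustrated with a minimal pure-Python sketch of Lloyd's k-means iteration. The sample coordinates, the function name `kmeans` and the treatment of coordinates as planar points are assumptions made for illustration only; the study itself clusters real pick-up locations into 200 clusters.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Cluster 2-D points with Lloyd's algorithm; returns (centers, clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # initial centers drawn from the data
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: attach every point to its closest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else centers[c]
            for c, cl in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical pick-up coordinates forming two well-separated groups.
PICKUPS = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, clusters = kmeans(PICKUPS, k=2)
cluster_sizes = sorted(len(cl) for cl in clusters)
```

The size of each resulting cluster plays the role of the circle size in the figure. For real GPS data, a projected coordinate system or a great-circle distance would be a more careful choice than the planar distance assumed here.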

## III Markov Decision Process Model

In this work, we extend the Markov Decision Process (MDP) framework in [11] to model the computerized service strategy of an electric taxi. MDP facilitates the formulation of computerized taxi service strategies, which can be implemented in a recommender system for taxi drivers. In general, an MDP comprises a set of states and a set of possible actions at each state. Each action transfers the current state to a new state with a probability and a reward. The objective is to find the optimal actions in the corresponding states that maximize the expected total reward.

### III-A States and Actions

First, we explain the states and actions of the MDP in our setting. A state of an electric taxi is described by three parameters: current time, current location and battery state, as explained below.

• Current Time: We consider discrete timeslots. One minute is used as the interval of a timeslot.

• Current Location: We consider the locations represented by the nearest junctions, instead of the absolute locations. A road network is constructed using OpenStreetMap (OSM) junction data. Each pick-up or drop-off location is assigned to the nearest junction in OSM. Let $\mathcal{N}$ be the set of all junctions.

• Battery State: We consider discrete levels of the state-of-charge of the battery of the electric taxi. The feasible battery state should be within the range $[\underline{B}, \overline{B}]$.

We denote the location of a taxi at time $t$ by $S(t)$, and the battery state by $B(t)$.

The allowable actions at the current junction are the moves to the neighbors of the junction in the road network, together with the recharging duration if the electric taxi is to be recharged at this junction. We denote an action from junction $i$ to junction $j$ with recharging duration $\tau$ at $i$ by $A = (i, j, \tau)$, where $i$ and $j$ are neighbors in the road network.
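As an illustration of the state and action spaces described above, the following minimal sketch encodes a state as (time, junction, battery level) and enumerates the allowable actions at a junction. The class names, the toy road network and the set of recharging durations are hypothetical and serve only to make the definitions concrete.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class State:
    t: int          # current timeslot (one-minute slots)
    junction: str   # current location, snapped to the nearest junction
    battery: int    # discrete state-of-charge level


@dataclass(frozen=True)
class Action:
    src: str        # current junction i
    dst: str        # neighboring junction j
    recharge: int   # recharging duration tau in minutes (0 means no recharge)


def allowable_actions(state: State,
                      neighbors: Dict[str, List[str]],
                      recharge_options: Sequence[int]) -> List[Action]:
    """Enumerate every (i, j, tau) with j a neighbor of the current junction."""
    return [Action(state.junction, j, tau)
            for j in neighbors[state.junction]
            for tau in recharge_options]


# Hypothetical three-junction road network and recharging choices.
ROAD = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
actions = allowable_actions(State(t=0, junction="a", battery=5), ROAD, (0, 30))
```

Frozen dataclasses are hashable, so states and actions of this form can be used directly as dictionary keys when tabulating values or policies.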

### III-B State Transition and Objective Function

The basic idea of the MDP for computerized taxi service strategy is illustrated in Fig. 2. Assume that the current location is $i$ and action $A = (i, j, \tau)$ is taken. The next location will be $j$, after recharging for a duration $\tau$ at $i$. When entering junction $j$, there is a probability that no passenger is picked up, after which the taxi driver will take another action. Otherwise, a passenger with a random destination is picked up. The taxi driver will then decide whether the current battery state is sufficient to deliver the passenger to the respective destination; if not, the trip is discarded. The detailed description of the MDP is provided in the following.

First, we define several parameters for the MDP as follows.

• $P^{\mathrm{p}}_t(i)$: The probability of successfully picking up a passenger at junction $i$ at time $t$.

• $P^{\mathrm{d}}_t(i,j)$: The probability of a passenger commuting from junction $i$ to junction $j$ at time $t$.

• $T^{\mathrm{a}}_t(A)$: The required time (mins) for executing action $A$ at time $t$.

• $T^{\mathrm{t}}_t(i,j)$: The required traveling time (mins) from junction $i$ to junction $j$ at time $t$.

• $E^{\mathrm{e}}_t(i,j)$: The required energy consumption (kWh) from junction $i$ to junction $j$ at time $t$.

• $F_t(i,j)$: The net revenue of transporting passengers from junction $i$ to junction $j$, which is calculated based on the fare rule of New York taxis and the respective energy costs. There are various surcharges at different times and on different days, and hence the net revenue is time-dependent.

• $U^{\mathrm{a}}_t(i,j)$: The energy cost from junction $i$ to junction $j$ at time $t$.

Note that several of these parameters can be estimated from the taxi trip dataset, as discussed in Section IV.

Next, we formulate a recurrent equation for describing the MDP, namely, Eqn. (1) (as illustrated in Fig. 3).

If the current location is $i$, after action $A = (i, j, \tau)$ has been taken, the next location will be $j$. The required time of the action is computed as follows:

1. If recharging duration $\tau = 0$, the taxi directly goes to junction $j$. The required time of action $A$ is given by

 $$T^{\mathrm{a}}_t(A) = T^{\mathrm{t}}_t(i,j)$$

2. If recharging duration $\tau > 0$, before driving to junction $j$, the taxi first goes to the nearest charging station $r(i)$ to recharge. The required traveling time to charging station $r(i)$ is $T^{\mathrm{t}}_t(i,r(i))$. Then the electric taxi is recharged for duration $\tau$ and next goes from charging station $r(i)$ to junction $j$, whose required traveling time is $T^{\mathrm{t}}_{t+T^{\mathrm{t}}_t(i,r(i))+\tau}(r(i),j)$. Thus, the total required time of action $A$ is given by

 $$T^{\mathrm{a}}_t(A) = T^{\mathrm{t}}_t(i,r(i)) + \tau + T^{\mathrm{t}}_{t+T^{\mathrm{t}}_t(i,r(i))+\tau}(r(i),j)$$
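The two cases above can be sketched as a single function. This is a minimal illustration, assuming a time-dependent travel-time lookup `travel_time(t, u, v)` and a mapping `nearest_station` from each junction to its nearest charging station; both names and the toy numbers are hypothetical.

```python
from typing import Callable, Dict

# Base travel times in minutes on a toy network ("s" is a charging station).
BASE_MINUTES = {("a", "b"): 10, ("a", "s"): 4, ("s", "b"): 8}
NEAREST_STATION = {"a": "s"}


def toy_travel_time(t: int, u: str, v: str) -> int:
    """Hypothetical lookup: travel takes twice as long between 9 and 10 AM."""
    minutes = BASE_MINUTES[(u, v)]
    return minutes * 2 if 540 <= t < 600 else minutes


def action_time(t: int, i: str, j: str, tau: int,
                travel_time: Callable[[int, str, str], int],
                nearest_station: Dict[str, str]) -> int:
    """Required time of action (i, j, tau) started at time t, in minutes."""
    if tau == 0:
        # Case 1: drive directly from i to j.
        return travel_time(t, i, j)
    # Case 2: detour via the nearest station r(i), recharge for tau minutes,
    # then drive from r(i) to j using the travel time at the departure time.
    r = nearest_station[i]
    to_station = travel_time(t, i, r)
    departure = t + to_station + tau
    return to_station + tau + travel_time(departure, r, j)


direct = action_time(0, "a", "b", 0, toy_travel_time, NEAREST_STATION)
with_recharge = action_time(0, "a", "b", 30, toy_travel_time, NEAREST_STATION)
into_rush_hour = action_time(530, "a", "b", 30, toy_travel_time, NEAREST_STATION)
```

The third call shows why the second leg is indexed by the departure time from the station: a recharge that ends during congested hours lengthens the remaining drive.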

Note that if the state-of-charge of the battery is insufficient, certain actions are infeasible (e.g., driving to a distant location to pick up passengers). Therefore, an action is allowed only if its required energy consumption can be supported by the current battery state. If the current battery state is $b$, after action $A$ has been taken, the new battery state at $j$ will be $b'$, which equals $b$ minus the energy consumed by the action plus the recharged energy $\tau \cdot C$, where $C$ is the charging rate. We also denote the arrival time at junction $j$ by $t' = t + T^{\mathrm{a}}_t(A)$.

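At junction $j$, there are three possible state transitions, referred to below as (C1), (C2) and (C3):

1. (C1) The taxi successfully picks up a passenger at junction $j$ (say, with destination $k$) and $b'$ is sufficient to deliver the passenger to junction $k$ and then to the nearest charging station $r(k)$, if necessary. For each $k$, the probability is $P^{\mathrm{p}}_{t'}(j) \cdot P^{\mathrm{d}}_{t'}(j,k)$, subject to the constraint $E^{\mathrm{e}}_t(j,k) + E^{\mathrm{e}}_t(k,r(k)) + \underline{B} \le b'$, such that the resultant battery state never falls below the minimal level $\underline{B}$. Hence, the probability of picking up and delivering a passenger is $P^{\mathrm{p}}_{t'}(j) \cdot P^{\mathrm{s}}_{t'}(j)$, where $P^{\mathrm{s}}_{t'}(j)$ is the probability that the destination of the passenger is reachable for the taxi under the battery constraint, and is computed by

 $$P^{\mathrm{s}}_{t'}(j) = \sum_{k \in \mathcal{N}:\, E^{\mathrm{e}}_t(j,k) + E^{\mathrm{e}}_t(k,r(k)) + \underline{B} \le b'} P^{\mathrm{d}}_{t'}(j,k)$$

2. (C2) The taxi successfully picks up a passenger at junction $j$, but $b'$ is insufficient to deliver the passenger to junction $k$ and then to the nearest charging station $r(k)$. The total probability of such a case is

 $$\sum_{k \in \mathcal{N}:\, E^{\mathrm{e}}_t(j,k) + E^{\mathrm{e}}_t(k,r(k)) + \underline{B} > b'} P^{\mathrm{p}}_{t'}(j) \cdot P^{\mathrm{d}}_{t'}(j,k)$$

3. (C3) The taxi cannot successfully pick up a passenger at junction $j$. The probability is $1 - P^{\mathrm{p}}_{t'}(j)$.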

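Note that the probability that the taxi does not deliver any passenger (including (C2) and (C3)) is $1 - P^{\mathrm{p}}_{t'}(j) \cdot P^{\mathrm{s}}_{t'}(j)$. The complement of $P^{\mathrm{s}}_{t'}(j)$, i.e., $1 - P^{\mathrm{s}}_{t'}(j)$, is given by

 $$1 - P^{\mathrm{s}}_{t'}(j) \triangleq \sum_{k \in \mathcal{N}:\, E^{\mathrm{e}}_t(j,k) + E^{\mathrm{e}}_t(k,r(k)) + \underline{B} > b'} P^{\mathrm{d}}_{t'}(j,k)$$

Hence, we obtain

 $$\begin{aligned} 1 - P^{\mathrm{p}}_{t'}(j) \cdot P^{\mathrm{s}}_{t'}(j) &= 1 - P^{\mathrm{p}}_{t'}(j) + P^{\mathrm{p}}_{t'}(j) \cdot \big(1 - P^{\mathrm{s}}_{t'}(j)\big) \\ &= 1 - P^{\mathrm{p}}_{t'}(j) + \sum_{k \in \mathcal{N}:\, E^{\mathrm{e}}_t(j,k) + E^{\mathrm{e}}_t(k,r(k)) + \underline{B} > b'} P^{\mathrm{p}}_{t'}(j) \cdot P^{\mathrm{d}}_{t'}(j,k) \end{aligned}$$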
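The probabilities of the three cases can be checked numerically with a short sketch. The destination probabilities, energy values and function names below are hypothetical; the functions simply restrict the destination distribution to the destinations that the battery can support.

```python
def reachable_probability(dest_prob, energy_to_dest, energy_to_station,
                          b_arrival, b_min):
    """Sum the destination probabilities over destinations k that satisfy
    E(j,k) + E(k,r(k)) + b_min <= b_arrival."""
    return sum(p for k, p in dest_prob.items()
               if energy_to_dest[k] + energy_to_station[k] + b_min <= b_arrival)


def no_delivery_probability(p_pick, p_reach):
    """Probability of (C2) or (C3): one minus pick-up times reachability."""
    return 1.0 - p_pick * p_reach


# Hypothetical values for one junction j (energies in kWh).
DEST_PROB = {"k1": 0.5, "k2": 0.3, "k3": 0.2}
ENERGY_TO_DEST = {"k1": 2.0, "k2": 5.0, "k3": 9.0}
ENERGY_TO_STATION = {"k1": 1.0, "k2": 1.0, "k3": 2.0}

p_reach = reachable_probability(DEST_PROB, ENERGY_TO_DEST, ENERGY_TO_STATION,
                                b_arrival=8.0, b_min=2.0)
p_pick = 0.5
p_none = no_delivery_probability(p_pick, p_reach)
# The same quantity written as (C3) plus (C2), as in the derivation above.
p_none_by_cases = (1.0 - p_pick) + p_pick * (1.0 - p_reach)
```

In this toy instance only the most distant destination is out of reach, so the two ways of computing the no-delivery probability agree up to rounding.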

For (C1), the taxi driver will receive a fare of amount $F_{t'}(j,k)$, and the next location of the taxi becomes $k$. For (C2) and (C3), the taxi driver will not receive any fare, and will decide to drive to another location or possibly recharge the taxi.
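The three cases determine the expected value of taking an action, and chaining these one-step expectations over time yields a service policy. The sketch below is a toy illustration only: it assumes a standard Bellman-style backup, a two-junction network, unit-time and unit-energy hops, a charger located at junction `a` itself, and no reserve for the trip from a destination to a charging station. None of these values come from the dataset, and the function names are hypothetical.

```python
from functools import lru_cache

# Toy problem (all values are illustrative assumptions).
HORIZON = 6                      # timeslots in the shift
B_MIN, B_MAX = 1, 4              # battery bounds, in energy units
CHARGE_RATE = 1                  # energy units gained per timeslot of charging
UNIT_PRICE = 0.5                 # price per energy unit
HOP_TIME, HOP_ENERGY = 1, 1      # every hop takes one slot and one unit
NEIGHBORS = {"a": ["b"], "b": ["a"]}
STATIONS = {"a"}                 # toy: the charger sits at the junction itself
P_PICK = {"a": 0.2, "b": 0.8}                    # pick-up probability
P_DEST = {"a": {"b": 1.0}, "b": {"a": 1.0}}      # destination probability
NET_FARE = {("a", "b"): 4.5, ("b", "a"): 4.5}    # fare minus trip energy cost


def feasible_actions(i, b):
    """Yield (j, tau, b_arrival) for every action the battery can support."""
    for j in NEIGHBORS[i]:
        for tau in ((0, 2) if i in STATIONS else (0,)):
            b_arrival = min(B_MAX, b + tau * CHARGE_RATE) - HOP_ENERGY
            if b_arrival >= B_MIN:
                yield j, tau, b_arrival


@lru_cache(maxsize=None)
def best(t, i, b):
    """Return (max expected net revenue, best action) from state (t, i, b)."""
    result = (0.0, None)                       # stopping early earns nothing
    for j, tau, b1 in feasible_actions(i, b):
        t1 = t + tau + HOP_TIME                # arrival time at junction j
        if t1 >= HORIZON:
            continue                           # action would overrun the shift
        expected = -(HOP_ENERGY + tau * CHARGE_RATE) * UNIT_PRICE
        p_deliver = 0.0
        for k, p_dest in P_DEST[j].items():
            if HOP_ENERGY + B_MIN <= b1:       # (C1): destination is reachable
                p = P_PICK[j] * p_dest
                p_deliver += p
                future = best(t1 + HOP_TIME, k, b1 - HOP_ENERGY)[0]
                expected += p * (NET_FARE[(j, k)] + future)
        # (C2) and (C3): no delivery, so the taxi keeps roaming from j.
        expected += (1.0 - p_deliver) * best(t1, j, b1)[0]
        if expected > result[0]:
            result = (expected, (j, tau))
    return result


value, action = best(0, "a", 4)
```

Memoized recursion over increasing time visits the same states as a table filled backwards from the last timeslot, so this is one way to organize the computation, not necessarily the one used in the study.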

The objective of the MDP is to maximize the total expected net revenue. Note that the net revenue of an action is the received fare minus the energy cost of the action. The expected net revenue for an action $A$ at state $(t, i, b)$ is denoted by $R^{*}[t, i, b, A]$, which can be computed recurrently in Eqn. (1), where

• $t'$ and $b'$ are the time and the battery state upon arriving at junction $j$.

• $\max_{A} R^{*}[t, i, b, A]$ is the maximal expected net revenue in state $(t, i, b)$ over all possible actions.

• $U^{\mathrm{a}}_t(i,j)$ is the energy cost of the action, computed as follows:

1. If recharging duration $\tau = 0$, the taxi directly goes to junction $j$. The energy cost is $U^{\mathrm{a}}_t(i,j) = E^{\mathrm{e}}_t(i,j) \cdot U$, where $U$ is the unit price of energy (20 cents/kWh for electricity and 2.5 USD/gallon for gasoline).

2. If recharging duration $\tau > 0$, the taxi goes to the nearest charging station $r(i)$ to recharge at charging rate $C$. The energy cost of the action is given by

 $$U^{\mathrm{a}}_t(i,j) = \Big(E^{\mathrm{e}}_t(i,r(i)) + E^{\mathrm{e}}_{t+T^{\mathrm{t}}_t(i,r(i))+\tau}(r(i),j) + \tau \cdot C\Big) \cdot U$$

We seek to devise an optimal policy for the MDP that maximizes the expected net revenue:

 $$\pi(t, S(t), B(t)) = \arg\max_{A} R^{*}[t, S(t), B(t), A] \qquad (2)$$

To obtain the optimal policy for the MDP, one can use dynamic programming. The dynamic programming algorithm starts from the last timeslot and works backwards to the first timeslot. For example, to solve the optimal policy for a morning shift, the algorithm first solves the maximal expected net revenue at the end of the shift, and then works backwards.

## IV Markov Decision Process Parameters

In this section, we estimate several parameters of the MDP from the NYC taxi trip dataset, including the pick-up probability $P^{\mathrm{p}}_t(i)$, the destination probability $P^{\mathrm{d}}_t(i,j)$, the energy consumption $E^{\mathrm{e}}_t(i,j)$ and the net revenue $F_t(i,j)$.

### IV-A Driving Speed Network

First, we construct a driving speed network from the NYC taxi trip dataset, for the following purposes:

1. To estimate the traveling time from each junction to the nearest charging station.

2. To estimate the energy consumption of a taxi for a trip.

Note that traveling time and driving speed are time-dependent parameters, since they are highly affected by traffic conditions, which are estimated from historical trip data. For example, the traveling time between the same pair of junctions will be higher in office hours and much lower at midnight.

The first step of constructing the driving speed network is to determine the driving path of a taxi. Spatialite [17] is used to calculate the shortest path for each pair of pick-up and drop-off locations. Spatialite utilizes OpenStreetMap (OSM) data.

A resulting path comprises a list of edges (i.e., segments), each described by two junctions. We then compare the recorded trip distance in the taxi trip dataset with the computed shortest-path distance. If the difference is greater than 300 meters, the record is discarded, since the driver is likely to have taken another route. For each remaining trip, we obtain the average speed from the recorded traveling time and distance, and label every segment of its computed path with that speed. Each segment thus receives several average speeds from different trips. We select the highest speed to represent the driving speed of the segment, since this is usually the speed with minimal obstacles.

Driving speed networks at different times are visualized in Fig. 4. We observe relatively more congested traffic from 9 to 10 AM and from 4 to 5 PM.

Given the driving speed network, we can estimate the driving time of each trip. We can also estimate the idling time of each trip by subtracting the estimated driving time from the recorded traveling time. The detailed steps for calculating the idling time are as follows:

1. Average traveling time: Several trips may start from junction $i$ and end at junction $j$, with slightly different traveling times. We average the traveling times of these trips.

2. Driving time: The shortest path from junction $i$ to junction $j$ is determined by Spatialite. Then, the driving time of each segment is computed from its distance and its driving speed in the driving speed network.

3. Idling time: The idling time of a trip is obtained by subtracting the driving time from the average traveling time.

To understand traffic conditions, we define a metric called the idling ratio for each source and destination pair, and consider the median of the distribution of idling ratios within a given time interval. Fig. 5 shows the distribution of idling ratios. We observe that the median is 56% for 9-10 AM, but only 33% for 3-4 AM due to less traffic.

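The segment-labeling and idling-time steps of this subsection can be sketched as follows. This is a minimal illustration on hypothetical trip records: the field names, the toy network and the helper names are assumptions, and the shortest paths are taken as given rather than computed with Spatialite.

```python
def label_segment_speeds(trips, max_gap_km=0.3):
    """Label each segment with the highest average trip speed (km/h) seen on it.

    Trips whose recorded distance differs from the shortest-path distance by
    more than max_gap_km are discarded, as the driver likely took another route.
    """
    speeds = {}
    for trip in trips:
        if abs(trip["recorded_km"] - trip["path_km"]) > max_gap_km:
            continue
        kmh = trip["recorded_km"] * 60.0 / trip["minutes"]
        for segment in trip["path"]:
            speeds[segment] = max(speeds.get(segment, 0.0), kmh)
    return speeds


def driving_minutes(path, segment_km, speeds):
    """Driving time of a path from segment lengths and labeled speeds."""
    return sum(segment_km[s] / speeds[s] * 60.0 for s in path)


def idling_minutes(travel_minutes, path, segment_km, speeds):
    """Average traveling time minus the estimated driving time."""
    average = sum(travel_minutes) / len(travel_minutes)
    return average - driving_minutes(path, segment_km, speeds)


# Hypothetical records; the third trip deviates too far from its shortest path.
TRIPS = [
    {"path": [("a", "b"), ("b", "c")], "path_km": 2.0, "recorded_km": 2.1, "minutes": 6},
    {"path": [("b", "c")], "path_km": 1.0, "recorded_km": 1.0, "minutes": 2},
    {"path": [("a", "b")], "path_km": 1.0, "recorded_km": 1.8, "minutes": 3},
]
SEGMENT_KM = {("a", "b"): 1.0, ("b", "c"): 1.0}

speeds = label_segment_speeds(TRIPS)
idle = idling_minutes([6.0], [("a", "b"), ("b", "c")], SEGMENT_KM, speeds)
```

Because every segment carries its fastest observed speed, the estimated driving time is optimistic and the remainder of the recorded traveling time is attributed to idling.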
### IV-B Passenger Pick-up Probability $P^{\mathrm{p}}_t(i)$

The passenger pick-up probability describes the chance that a taxi driver can pick up a passenger at junction $i$ at time $t$. Following the idea in [11], we use the numbers of taxis and pick-ups around a particular junction to calculate the pick-up probability within $\tau$ mins. First, denote the number of pick-ups at junction $i$ from time $t$ to $t+\tau$ by $N^{\mathrm{p}}_{t:t+\tau}(i)$. To estimate the number of taxis around junction $i$ within $\tau$ mins, denote by $N^{\mathrm{d}}_{t-\tau:t+\tau}(i)$ the number of drop-offs from time $t-\tau$ to $t+\tau$ within a given distance (in kilometers) from junction $i$. We assume that the taxis are vacant after dropping off their passengers and immediately roam around junction $i$ within this distance in $\tau$ mins. Thus, the pick-up probability can be estimated by

 $$P^{\mathrm{p}}_t(i) = \frac{N^{\mathrm{p}}_{t:t+\tau}(i)}{N^{\mathrm{p}}_{t:t+\tau}(i) + N^{\mathrm{d}}_{t-\tau:t+\tau}(i)} \qquad (3)$$

Suitable values of the time window $\tau$ and the distance can be obtained from the historical taxi trip dataset. For example, $\tau$ can be estimated by the average inter-pick-up duration, i.e., the time interval between consecutive pick-ups of a taxi. Using the average driving speed, the distance can be estimated by the reachable distance in the average inter-pick-up duration. Fig. 6(a) depicts the average inter-pick-up durations for weekdays and weekends. We observe that it takes more time to find a passenger at 4 AM on weekdays and at 7 AM on weekends. Fig. 6(b) depicts the respective reachable distance in the inter-pick-up duration. In the following study, we set the time window and the distance to be time-varying, according to the average inter-pick-up duration and the respective reachable distance from the taxi trip dataset for each hour.

### IV-C Passenger Destination Probability $P^{\mathrm{d}}_t(i,j)$

The passenger destination probability describes the chance that a passenger needs to commute from one junction to another junction. This probability is time-dependent because, for example, passengers are more likely to commute from residential areas to offices in working hours.

A one-hour timeslot is used to estimate the passenger destination probability from the taxi trip dataset. In each timeslot, we obtain the number of trips between each pair of source and destination, which is then normalized by the total number of trips from that source. Denote the destination probability from junction $i$ to junction $j$ at time $t$ by $P^{\mathrm{d}}_t(i,j)$. Denote the number of pick-ups at junction $i$ by $N^{\mathrm{p}}_t(i)$, and the number of corresponding drop-offs at junction $j$ by $N^{\mathrm{d}}_t(i,j)$. The passenger destination probability from junction $i$ to junction $j$ is estimated by

 $$P^{\mathrm{d}}_t(i,j) = \frac{N^{\mathrm{d}}_t(i,j)}{N^{\mathrm{p}}_t(i)} \qquad (4)$$

### IV-D Energy Consumption $E^{\mathrm{e}}_t(i,j)$

We use a black-box approach to estimate the energy consumption of EVs, based on the work in [13, 14]. The energy consumption model is based on the average driving speed and the auxiliary loading. The total energy consumption can be decomposed into moving energy consumption and auxiliary loading energy consumption, which can be estimated by multivariate linear models (see [13, 14] for details):

 $$E^{\mathrm{e}}_t(i,j) = E^{\mathrm{mv}}_t(i,j) + E^{\mathrm{ax}}_t(i,j) \qquad (5)$$

 $$E^{\mathrm{mv}}_t(i,j) = \beta \big(\alpha_1 v_t(i,j)^2 + \alpha_2 v_t(i,j) + \alpha_3\big) \cdot D(i,j) \qquad (6)$$

 $$E^{\mathrm{ax}}_t(i,j) = \ell_t \, T^{\mathrm{t}}_t(i,j) / 60 \qquad (7)$$

where $v_t(i,j)$ is the driving speed between junctions $i$ and $j$ at time $t$, obtained from the driving speed network, and $D(i,j)$ is the driving distance between junction $i$ and junction $j$. The auxiliary loading $\ell_t$ is highly affected by the weather temperature, which is time-variant. The auxiliary loading can be estimated from the historic weather temperature and the average auxiliary loading measurements at particular temperatures (see [18] for an empirical measurement study; according to New York historical weather and the suggested power load, the average auxiliary loading is between 1 and 1.5 kW).

The parameter $\beta$ represents an aggressiveness factor that captures the driving behavior. Driver behavior has an impact on the energy consumption of vehicles, as the driving range is significantly decreased by aggressive acceleration and deceleration. Mild driving behavior can save up to 30% to 40% of energy consumption compared with aggressive driving behavior [19, 20].

Therefore, we define three classes of driving behaviors, each with its own value of $\beta$: i) mild drivers, ii) normal drivers, and iii) aggressive drivers. Based on previous work [13], the parameters of the energy consumption model are set to those of the Nissan Leaf.

### IV-E Energy Consumption $E^{\mathrm{e}}_t(i,r(i))$

Electric taxis should arrive at each junction with a battery state that guarantees that they can reach the nearest charging station. The locations of NYC charging stations are obtained from [16]. We consider only the charging stations that are open to general EVs; other charging stations that require memberships are not considered in this study. To estimate the minimum required energy consumption from junction $i$ to the nearest charging station at time $t$, the minimum distance between the junction and the nearest charging station is obtained as follows:

1. Spatialite is used to find the nearest charging station $r(i)$ for junction $i$ in the road network by the shortest distance.

2. The shortest distance is converted into the required driving time based on the driving speed network.

3. The median idling ratio is used to estimate the idling time at time $t$.

4. Given the driving speed network and the idling time, the energy consumption is obtained by Eqn. (5).

### IV-F Taxi Net Revenue $F_t(i,j)$

The fares are calculated according to the rules for New York taxis. Since there are different kinds of surcharges based on times and days, the fare is time-dependent. The fare rules are as follows. The initial charge is 2.50 USD, plus 50 cents per 1/5 mile, or 50 cents per 60 seconds in slow traffic or when the taxi is stopped. A 50-cent MTA State Surcharge is required for all trips that end in New York City. Another 30-cent Improvement Surcharge is required. A daily 50-cent surcharge is required from 8 PM to 6 AM. A 1 USD surcharge is required from 4 PM to 8 PM on weekdays, excluding holidays. Toll fees are ignored, since the taxi driver will not receive any revenue from tolls.

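The fare rules above can be turned into a small calculator. This is a simplified sketch under stated assumptions: the moving distance and the minutes spent in slow traffic are supplied separately, only complete charging units are counted, the trip ends in New York City, and surcharges depend on the pick-up hour alone. The function name and its parameters are hypothetical.

```python
def nyc_fare(distance_miles, slow_minutes, pickup_hour, weekday, holiday=False):
    """Estimate a fare in USD from the listed rules (tolls ignored)."""
    # 50 cents per 1/5 mile travelled plus 50 cents per minute in slow traffic.
    units = int(distance_miles * 5) + int(slow_minutes)
    fare = 2.50 + 0.50 * units
    fare += 0.50 + 0.30            # MTA State Surcharge and Improvement Surcharge
    if pickup_hour >= 20 or pickup_hour < 6:
        fare += 0.50               # daily surcharge from 8 PM to 6 AM
    if weekday and not holiday and 16 <= pickup_hour < 20:
        fare += 1.00               # weekday surcharge from 4 PM to 8 PM
    return round(fare, 2)


morning = nyc_fare(2.0, 3, pickup_hour=9, weekday=True)
night = nyc_fare(2.0, 3, pickup_hour=22, weekday=True)
rush_hour = nyc_fare(2.0, 3, pickup_hour=17, weekday=True)
weekend_afternoon = nyc_fare(2.0, 3, pickup_hour=17, weekday=False)
```

The same two-mile trip is priced differently depending only on the pick-up hour and day, which is why the revenue term of the model carries a time index.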
The net revenue of a trip can be calculated by deducting the fuel/electricity cost from the revenue. Therefore, the net revenue of a trip from junction $i$ to junction $j$ at time $t$ is

$$F_{t}(i,j) = F^{R}_{t}(i,j) - E^{e}_{t}(i,j) \cdot U \qquad (8)$$

where $F^{R}_{t}(i,j)$ is the recorded amount of base fare plus the surcharges from $i$ to $j$ at time $t$, and $U$ is the unit price.

### IV-G Charging Rate $C$

Two types of charging rates are considered in this study: mode 3 charging and (direct current) fast charging. Currently, mode 3 charging is more common than fast charging. The charging power of mode 3 charging is 6.6 kW (e.g., for Nissan Leaf), whereas the charging power of fast charging is 50 kW.

## V Evaluation based on NYC Taxi Trip Dataset

In this section, we apply the MDP to optimize computerized taxi service strategies and evaluate the improvement in net revenues using the NYC taxi trip dataset. We first examine the net revenue of conventional ICE taxis and the improvement by MDP under a basic setting with complete knowledge of taxi trip information for one single taxi, which represents the best-case scenario. Next, we study a similar setting for electric taxis. Then, we relax the basic setting to more realistic settings: (1) using only historical data as training dataset, (2) an extension to multiple taxis, and (3) considering different driving behaviors.

### V-A Basic Setting of ICE Taxi

Setting: This section presents an evaluation study based on the one-day data of January 9, 2013 in the NYC taxi trip dataset. In Sec. V-G, an evaluation using a whole year's data will be presented. First, we note that the NYC taxi trip dataset has only records of trip distance and duration, and pick-up and drop-off information. There is no full mobility data trace of taxis, in particular when the taxis are roaming without passengers. It is therefore difficult to estimate the exact total travel distance (i.e., including roaming and passenger delivery).
Hence, we estimate a lower bound for the total travel distance by connecting the shortest path between a drop-off location and the subsequent pick-up location. As such, we obtain an optimistic estimation of net revenue (i.e., revenue minus fuel cost) from the lower bound of total travel distance. We consider a basic setting, such that the optimal policy of MDP is employed in one single taxi, based on complete knowledge of taxi trip information on the same day from the dataset. In Secs. V-C and V-D, the use of historical data for prediction and the extension to multiple taxis will be presented.

Note that it is challenging to evaluate the exact performance of modified taxi behavior using a historical dataset. For example, when a passenger who was originally picked up by another taxi in the dataset is instead picked up by a taxi with modified behavior, it is not clear how the original taxi should behave in the evaluation. Therefore, we consider a simple approach of evaluation, such that other taxis always follow the recorded trajectories in the dataset, regardless of whether they pick up their recorded passengers or not. Although this will not attain absolute accuracy, it is a simple approach that does not require knowledge of the disrupted behavior of other taxis in real life. Note that if we only modify the behavior of a small number of taxis, then this simple approach will give a rather accurate evaluation. Also, refueling is not considered for ICE taxis, because ICE taxi drivers normally fill up the gas tanks between shifts. (Most NYC taxis operate in two shifts per day, each normally lasting 12 hours. More than 40% of taxi drivers change shifts at around 5 AM or 5 PM. In this study, we assume that a morning shift is from 5 AM to 5 PM, whereas an evening shift is from 5 PM to 5 AM.) Then the MDP model for an ICE taxi is identical to that of an electric taxi in Sec. III, but without recharging decisions.

Observations: Based on the NYC taxi trip dataset, Fig.
6(a) shows the distribution of (optimistically) estimated net revenues from all the trips (with 11746 taxi drivers) for morning shifts. The blue dashed line indicates the median of taxi drivers. We observe that 50% of drivers earn above USD 223. The red dashed line indicates the expected estimated net revenue when a taxi driver follows the optimal policy of MDP, assuming 12 working hours. This taxi driver is expected to earn USD 440. Therefore, optimizing the taxi service strategy enables a taxi driver to earn at most among the top 0.01%. Fig. 6(b) shows the passenger delivery distances per taxi driver. More than 50% of taxis travel more than 79 kilometers for passenger delivery. By optimizing the taxi service strategy, a taxi driver is expected to travel up to 155 kilometers for passenger delivery. Figs. 6(c)-6(d) show the distributions for evening shifts. The median net revenue is smaller than that of morning shifts because of shorter working hours (Fig. 7(c)). Also, we observe that the median passenger delivery distance for evening shifts is similar to that of morning shifts.

The computation of the expected net revenue of the optimal policy of MDP assumes 12 working hours. The distribution of working hours for morning shifts is shown in Fig. 7(a). We observe that most drivers work less than 12 hours, and the median working time on the day is 8.7 hours. For a normalized comparison, we also study the hourly net revenues, instead of net revenues per shift. The distribution of estimated hourly net revenues for morning shifts is presented in Fig. 7(b). We observe that the hourly net revenue of the MDP driver is within the top 5% in both shifts. We notice that higher hourly net revenue is due to shorter working hours with long trips. Fig. 7(c) shows that taxi drivers have shorter working hours for evening shifts, but higher hourly net revenues, because of the extra surcharge for evening shifts.
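Statements such as "top 5%" compare one value against the empirical distribution over drivers. The snippet below is a minimal sketch of one reasonable reading of that ranking, namely the share of drivers earning at least the given value. It uses a made-up list of revenues rather than the NYC data, and the function name is hypothetical.

```python
def top_percent(value, population):
    """Percentage of the population whose value is at least `value`."""
    if not population:
        raise ValueError("population must not be empty")
    at_least = sum(1 for x in population if x >= value)
    return 100.0 * at_least / len(population)


# Made-up per-shift net revenues in USD, for illustration only.
revenues = [150, 180, 200, 223, 240, 260, 300, 350, 400, 450]
print(top_percent(400, revenues))  # 2 of 10 drivers earn at least 400
```

The same function applies to hourly net revenues, which is the normalized comparison used when drivers work shifts of different lengths.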

Ramifications: Optimizing taxi service strategies can significantly improve the profitability of taxi drivers. Our evaluation based on a basic setting shows that an optimized service strategy for a conventional ICE taxi can earn at most among the top 0.1%. Although this represents the best-case evaluation, the subsequent sections will relax it to more realistic settings, and yet still show a considerable advantage.

### V-B Basic Setting of Electric Taxi

Setting: In this section, we apply a similar basic evaluation based on the data of January 9, 2013 to electric taxis. We employ the energy consumption model of Nissan Leaf [13]. In fact, the most determining factor of the performance of EVs is battery capacity. Hence, the energy consumption model of Nissan Leaf suffices to provide a generic estimation of the energy consumption of electric taxis. Usually, EVs are not allowed to be overcharged or overly discharged, in order to protect the battery. Therefore, we set the available battery level from 5% to 95% of the capacity. We consider typical settings of battery capacity for EVs (e.g., 30 kWh, 50 kWh, 70 kWh). Each setting can affect the recharging decisions and net revenues considerably.

Observations: Figs. 8(a)-8(b) show the estimated net revenues for electric taxis under different battery capacities. The blue bars represent the net revenues using fast charging, while the red bars represent those using mode 3 charging. We observe that electric taxis equipped with a 50 kWh battery and using fast charging can make net revenues comparable to those of traditional ICE taxis for morning shifts. Note that, in general, smaller batteries require more frequent recharging, which can reduce revenue, although EVs with smaller batteries are cheaper. The net revenue gap between fast charging and mode 3 charging becomes smaller when the battery capacity increases. The estimated net revenue reaches USD 438 when the battery capacity is above 50 kWh.
The net revenue is higher than that of ICE taxis using optimized service strategies (i.e., the USD 426 benchmark), because electricity is cheaper. Figs. 8(c)-8(d) show the driving distances and energy consumption under different battery capacities. The blue-bordered bars represent the driving distances using fast charging, while the red-bordered bars represent those using mode 3 charging. The green portions represent the amount of energy received from charging stations, while the gray portions represent the amount from the initial battery charge. We observe that the total driving distance is around 242 kilometers without recharging for morning shifts. At night, the electric taxis are expected to drive longer distances because of less traffic. The required energy consumption without charging for morning shifts is 43 kWh, which can be provided by a 50 kWh battery (i.e., 45 kWh usable capacity) without recharging. For evening shifts, the required energy consumption increases to 45.1 kWh. Therefore, electric taxis with a 50 kWh battery are required to recharge during evening shifts.

Ramifications: Optimizing taxi service strategies for electric taxis can improve the profitability of taxi drivers, but the effect depends on the battery capacity. With more capacity (e.g., 50 kWh, 70 kWh), the taxi driver can earn net revenue comparable to that of an ICE taxi using an optimized service strategy. This is because recharging incurs inefficiency for electric taxis with a low-capacity battery.

### V-C Using Historical Data for Prediction

Setting: The previous basic evaluation of net revenues is based on MDP using complete knowledge, which requires knowing the pick-up demands and locations a priori. However, complete information is difficult to obtain in practice. A more practical approach is to use only historical data as the training dataset for MDP, and then use the resulting optimal policy as a heuristic for other days.
In the following, we use the data from 6th January to 12th January (i.e., the first week after 1st January) as training data to obtain the optimal policy of MDP. Then we apply the policy to all morning shifts in the year in the evaluation.

Observations: Fig. 10 shows the estimated net revenue using one-day training data on different days of a week. We observe that the highest net revenue occurs on Friday while the lowest net revenue occurs on Sunday, because there are more passengers on Fridays. The figures also show the benchmark for ICE taxis using historical data (i.e., the gray band). In particular, Fig. 9(a) shows the net revenue with 70 kWh battery capacity using different dates in January as training data. We observe that the training data from 6th January performs the best while the training data from 8th January performs the worst. A taxi driver can receive 7.5% higher net revenue using the 6th January data. Fig. 9(c) shows the net revenues with 30 kWh battery capacity using training data from 6th January. We observe that 12% higher net revenue can be obtained using fast charging than using mode 3 charging. Using fast charging, electric taxis with 70 kWh battery capacity can obtain 2.8% higher net revenue than those with 30 kWh battery capacity.

Ramifications: Using historical data for prediction, instead of complete knowledge, will inevitably reduce the effectiveness. However, this has a similar effect on ICE taxis that also use historical data. Hence, optimizing taxi service strategies for electric taxis using historical data still achieves net revenues comparable to those of ICE taxis.

### V-D Multiple Electric Taxis

Setting: The optimal policy from MDP has so far been employed in one single taxi. Next, we apply the optimal policy to multiple taxis. The idea is to allow multiple electric taxis to adopt the optimal policy from MDP, while ensuring that the number of taxis sent to each location is constrained. Otherwise, this leads to over-provision of taxis at certain locations.
This simple constraint allows us to decouple the individual MDP decisions. Otherwise, the joint problem would be large, complex and intractable. In practice, we may display the potential net revenue of each junction to the taxi drivers. A junction becomes less desirable when the number of taxis currently present exceeds a certain threshold. Hence, drivers would prefer not to go to that junction. We first empirically study the distribution of the number of taxis at all the junctions over time from the dataset. We then set a limit on the number of taxis at each junction according to the mean number of taxis at that junction in the dataset. To satisfy the constraint, some electric taxis need to follow the second-best decisions in the optimal policy. Each taxi state is initialized by the junction and the time according to the dataset. The state of each taxi is tracked and the passenger pick-up probability is recomputed using Eqn. (3). We use the optimal policy based on the data of 6th January.

Observations: Fig. 10(a) displays the histogram of the number of taxis in a junction. We observe that the number of taxis in each junction is fewer than 7 for 99% of the time. Accordingly, the limit at each junction is set to the mean number of taxis observed there. Fig. 10(b) shows the net revenues of different numbers of electric taxis using the optimal policy of MDP on 9th January. We observe that the net revenue drops to USD 350 when 1000 electric taxis use the optimal policy of MDP. The red bar indicates the total driving distance of the taxis and the blue bar indicates the passenger delivery distance. We observe that the delivery distance drops but the total driving distance remains relatively steady. This implies that the increase of roaming distance is due to a lower passenger pick-up probability. Fig. 10(c) shows the average net revenue of multiple taxis over the entire year of 2013.
We observe that the highest net revenue occurs on Fridays while the lowest occurs on Sundays. We also observe that the net revenue is less affected by the number of taxis when mode 3 charging is used. This is because the electric taxis require frequent recharging, which may result in fewer available taxis and, hence, a higher pick-up probability.

Ramifications: If the optimal policy of MDP is deployed to up to 1000 electric taxis, then the net revenues will decrease, as a result of the diminishing advantage of computerized service strategies. These 1000 taxi drivers can still rank within the top 1.7% among traditional taxi drivers without computerized service strategies.
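The capped dispatch described in this section can be sketched as a greedy assignment: each taxi takes its highest-ranked junction that still has room and falls back to the next-best one otherwise. The sketch assumes every taxi already has a ranked list of candidate junctions from its own policy. The function name, the order in which taxis are processed and the toy data are illustrative assumptions rather than the implementation used in the study.

```python
def assign_with_caps(ranked_choices, caps):
    """Greedy assignment under per-junction limits.

    ranked_choices: dict taxi_id -> list of junction ids, best first.
    caps: dict junction id -> maximum number of taxis allowed there
          (junctions missing from `caps` are treated as closed).
    Returns dict taxi_id -> chosen junction, or None if all are full.
    """
    load = {}
    assignment = {}
    for taxi, choices in ranked_choices.items():
        assignment[taxi] = None
        for junction in choices:
            if load.get(junction, 0) < caps.get(junction, 0):
                load[junction] = load.get(junction, 0) + 1
                assignment[taxi] = junction
                break
    return assignment


# Toy example: three taxis all prefer junction "A", which admits one taxi.
ranked = {"t1": ["A", "B"], "t2": ["A", "B"], "t3": ["A", "B"]}
limits = {"A": 1, "B": 1}
print(assign_with_caps(ranked, limits))
```

Handling the limit outside the individual policies is what keeps each taxi's decision problem separate, at the price of some taxis settling for a lower-ranked junction.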

### V-E Considering Driving Behavior

Setting: Driving behavior plays an important role in energy consumption of vehicles. Aggressive driving behavior results in more energy consumption. Furthermore, higher energy consumption rate induces more frequent recharging of EVs, which reduces the net revenues of the taxi drivers. We study three classes of driving behaviors: i) mild drivers (), ii) normal drivers (), and iii) aggressive drivers ().

Observations: Fig. 12 shows the estimated net revenues of different driving behaviors for morning shifts. Fig. 11(a) shows the estimated net revenues of different driving behaviors using mode 3 charging. Mild drivers can receive 14% higher net revenue than aggressive drivers when driving a 30 kWh Leaf using mode 3 charging. However, the net revenue is less affected by driving behavior when the battery capacity is sufficiently large to eliminate recharging during a shift. Fig. 11(b) shows the estimated net revenues using fast charging. We observe that the net revenue is also less affected by driving behavior in this case, because of the shorter recharging duration. Fig. 12 shows the energy consumption of different driving behaviors. We observe that aggressive drivers consume around 11 kWh more energy than mild drivers.
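The energy model of Eqns. (5)-(7) in Sec. IV-D makes the role of the aggressiveness factor explicit: it scales only the moving energy, not the auxiliary load. A minimal sketch follows. The coefficient values, the aggressiveness values and the units (distance in km, travel time in minutes, load in kW, energy in kWh) are placeholders chosen for illustration, not the fitted Nissan Leaf parameters from [13].

```python
def moving_energy(speed, distance, beta, alphas):
    """Eqn. (6): aggressiveness-scaled, speed-dependent energy per distance."""
    a1, a2, a3 = alphas
    return beta * (a1 * speed ** 2 + a2 * speed + a3) * distance


def auxiliary_energy(aux_load_kw, travel_time_min):
    """Eqn. (7): auxiliary load (kW) over the travel time (minutes)."""
    return aux_load_kw * travel_time_min / 60.0


def trip_energy(speed, distance, travel_time_min, aux_load_kw, beta, alphas):
    """Eqn. (5): moving energy plus auxiliary energy."""
    return (moving_energy(speed, distance, beta, alphas)
            + auxiliary_energy(aux_load_kw, travel_time_min))


# Placeholder coefficients (NOT the values from [13]): a flat 0.2 kWh/km.
PLACEHOLDER_ALPHAS = (0.0, 0.0, 0.2)

# Hypothetical aggressiveness factors for three driver classes.
for label, beta in [("mild", 0.9), ("normal", 1.0), ("aggressive", 1.1)]:
    energy = trip_energy(30.0, 10.0, 30.0, 1.0, beta, PLACEHOLDER_ALPHAS)
    print(label, round(energy, 3))
```

Under this structure, a larger aggressiveness factor raises consumption in proportion to the distance driven, so its effect on revenue shows up mainly when it triggers an extra recharge.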

Ramifications: Although aggressive drivers consume 20% more energy, this results in only a USD 2.2 difference in net revenue for morning shifts. The result shows that driving behavior has a larger impact on the net revenue only when the battery capacity is insufficient to eliminate recharging during a shift.

### V-F Considering Different Gas Prices

Setting: To complete the study of the viability of electric taxis, we provide a study of ICE taxis' net revenue under different gas prices. Note that the current gas price in the USA is around USD 2.5 per gallon, while the current gas price in China is around 7.2 RMB per liter, which is equivalent to USD 4.5 per gallon. We analyze the outcomes of three different gas prices (i.e., USD 2.5, USD 3.5 and USD 4.5 per gallon) considering the optimal policy of MDP for an ICE taxi.
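The per-liter to per-gallon comparison involves both a volume and a currency conversion. The snippet below is a sketch of that arithmetic. The exchange rate is an assumed input, since the text does not state the rate used, while 3.785411784 liters per US gallon is the standard definition. The function name is hypothetical.

```python
LITERS_PER_US_GALLON = 3.785411784  # standard definition of the US gallon


def rmb_per_liter_to_usd_per_gallon(rmb_per_liter, rmb_per_usd):
    """Convert a fuel price from RMB per liter to USD per US gallon."""
    return rmb_per_liter * LITERS_PER_US_GALLON / rmb_per_usd


# Assumed exchange rate of 6.1 RMB per USD, for illustration only.
print(round(rmb_per_liter_to_usd_per_gallon(7.2, 6.1), 2))
```

With an exchange rate in this range, 7.2 RMB per liter comes out close to the USD 4.5 per gallon figure quoted above.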

Observations: Fig. 13 compares the annual net revenues of ICE taxis under different gas prices with those of electric taxis using different charging options. We observe that the annual net revenue with a gas price of USD 2.5 per gallon (i.e., the leftmost bar) is slightly higher (about USD 4000 higher) than that with a gas price of USD 4.5 per gallon. We also observe that a comparable net revenue can be achieved by a 30 kWh EV with fast charging when the gas price increases to USD 4.5 per gallon. However, the annual net revenue of a 30 kWh EV with mode 3 charging is much lower (about 14% lower), even when the gas price is high.
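One way to see why mode 3 charging costs a small-battery taxi so much revenue is to compare recharge durations at the two charging powers from Sec. IV-G. The sketch below assumes a constant charging power and a full recharge of the usable 5% to 95% window from Sec. V-B. Real charging tapers near full, so these are optimistic durations, and the function name is hypothetical.

```python
def recharge_hours(capacity_kwh, power_kw, low=0.05, high=0.95):
    """Hours to refill the usable battery window at constant power."""
    usable_kwh = capacity_kwh * (high - low)
    return usable_kwh / power_kw


for capacity in (30, 50, 70):
    mode3 = recharge_hours(capacity, 6.6)   # mode 3 charging, 6.6 kW
    fast = recharge_hours(capacity, 50.0)   # fast charging, 50 kW
    print(capacity, "kWh:", round(mode3, 2), "h vs", round(fast, 2), "h")
```

Every hour spent at a charger is an hour without passengers, so the gap between the two columns translates directly into lost working time within a 12-hour shift.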

Ramifications: We observe that when the gas price increases, the ICE taxi becomes a less attractive option since its net revenue decreases. The net revenue of a 70 kWh EV is around 3% higher than that of an ICE taxi when the gas price is USD 2.5 per gallon, while it is around 6.6% higher when the gas price increases to USD 4.5 per gallon.

### V-G Annual Evaluation of NYC Taxi Trip Dataset

#### V-G1 Net Revenue Evaluation

Setting: We apply the optimal policy from 6th January to different numbers of electric taxis and estimate the annual net revenues. Fig. 15 shows the distribution of annual working hours. We observe that the median annual working time is around 1800 hours, but many drivers work more than 4300 hours, equivalent to working almost 12 hours a day. Therefore, we consider taxi drivers working every morning shift (i.e., 4380 working hours) to estimate their net revenues.

Observations: The right figure in Fig. 15 shows the estimated net revenues of different taxi drivers using the optimal policy of MDP. We make the following observations:

• One electric taxi driver using the optimal policy of MDP can earn 3% more than one ICE taxi driver.

• In the case of 1000 electric taxis with 70 kWh battery, the average net revenue is ranked within the top 0.07% among traditional taxi drivers without computerized service strategy.

• In the case of 1000 electric taxis with 30 kWh battery using mode 3 charging, the average net revenue is ranked within the top 0.4% among traditional taxi drivers without computerized service strategy.

The results show that the optimal policy of MDP can enable electric taxi drivers to make revenues comparable to those of traditional taxi drivers.

#### V-G2 Carbon Emission Evaluation

Setting: Besides net revenue as an economic motivation, an important benefit of switching from ICE taxis to electric taxis is the reduction of carbon emissions. Although electric taxis do not produce tailpipe emissions, the electricity generation needed to recharge the battery may still produce emissions. In this section, we estimate the CO2 emissions of electric taxis, as compared with ICE taxis, with computerized service strategy optimization. The CO2 emission factors of electricity and gasoline are obtained from eGrid of Long Island [2]:

• Emission factor of electricity: 0.7007 kg/kWh

• Emission factor of gasoline: 2.348 kg/liter

Observations: We consider taxis working in all shifts. Fig. 13(a) shows the daily energy consumption of 1000 taxis for morning shifts, while Fig. 13(b) shows the daily energy consumption for night shifts. We use the miles per gallon gasoline equivalent to convert the consumed gasoline to kWh (i.e., 1 gallon of gasoline equals 33.7 kWh). Fig. 13(c) shows the annual energy consumption of different numbers of electric taxis. We observe that ICE taxis consume around 4 times more energy than electric taxis. Fig. 13(d) shows the corresponding CO2 emissions of different numbers of electric taxis. We observe that up to 15 thousand metric tons of CO2 (equivalent to the energy use of 1560 homes for one year) can be saved by replacing 1000 ICE taxis with electric taxis.
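The emission comparison rests on the two emission factors and the gasoline-equivalent conversion stated above. The snippet below is a sketch of that bookkeeping. The emission factors and the 33.7 kWh per gallon figure come from the text, the liters-per-gallon constant is the standard US definition, and the function names and sample quantities are hypothetical.

```python
ELECTRICITY_KG_PER_KWH = 0.7007      # emission factor of electricity (from the text)
GASOLINE_KG_PER_LITER = 2.348        # emission factor of gasoline (from the text)
KWH_PER_GALLON = 33.7                # gasoline-equivalent energy (from the text)
LITERS_PER_US_GALLON = 3.785411784   # standard definition of the US gallon


def ev_emissions_kg(energy_kwh):
    """CO2 emitted by the grid to supply the given charging energy."""
    return energy_kwh * ELECTRICITY_KG_PER_KWH


def ice_emissions_kg(gasoline_gallons):
    """CO2 emitted by burning the given amount of gasoline."""
    return gasoline_gallons * LITERS_PER_US_GALLON * GASOLINE_KG_PER_LITER


def gasoline_energy_kwh(gasoline_gallons):
    """Energy content of gasoline expressed in kWh."""
    return gasoline_gallons * KWH_PER_GALLON


# Hypothetical daily figures for one taxi, for illustration only.
print(round(ev_emissions_kg(45.0), 2), "kg CO2 for 45 kWh of charging")
print(round(ice_emissions_kg(5.0), 2), "kg CO2 for 5 gallons of gasoline")
print(round(gasoline_energy_kwh(5.0), 1), "kWh equivalent of 5 gallons")
```

Note that the electric taxi's saving comes from needing far less energy per kilometer; per kWh, the grid factor used here is higher than that of gasoline's energy content.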

## VI Conclusion

In this paper, we employ a Markov Decision Process to model computerized taxi service strategy and optimize the strategy for taxi drivers considering electric taxi operational constraints. We evaluate the effectiveness of the optimal policy of the Markov Decision Process using a big data study of real-world taxi trips in New York City. The optimal policy can be implemented in an intelligent recommender system for taxi drivers. This becomes more viable especially with the advent of autonomous vehicles. Our evaluation shows that computerized service strategy optimization allows electric taxi drivers with at least 50 kWh battery capacity to earn net revenues comparable to those of ICE drivers who also employ computerized service strategy optimization. Hence, this sheds light on the viability of electric taxis.

## References

• [1] NYC Taxi & Limousine Commission. (2014) Taxicab Fact Book.
• [2] USA Environmental Protection Agency. (2017) Greenhouse Gases Equivalences Calculator.
• [3] South China Morning Post. (2017) After Hong Kong Failure, China’s BYD Joins Singapore Launch.
• [4] E. Wilhelm, J. Siegel, S. Mayer, L. Sadamori, S. Dsouza, C.-K. Chau, and S. Sarma, “Cloudthink: A scalable secure platform for mirroring transportation systems in the cloud,” Transport, vol. 30, no. 3, 2015.
• [5] D. Zhang, L. Sun, B. Li, C. Chen, G. Pan, S. Li, and Z. Wu, “Understanding Taxi Service Strategies From Taxi GPS Traces,” IEEE Trans. Intell. Transp. Syst., vol. 16, pp. 123–135, 2015.
• [6] S. Liu, Y. Yue, and R. Krishnan, “Non-Myopic Adaptive Route Planning in Uncertain Congestion Environments,” IEEE Trans. Knowledge and Data Engineering, vol. 27, pp. 2438 – 2451, 2015.
• [7] S. Liu and S. Wang, “Trajectory Community Discovery and Recommendation by Multi-Source Diffusion Modeling,” IEEE Trans. Knowledge and Data Engineering, vol. 29, pp. 898–911, 2017.
• [8] J. Zhao, Q. Qu, F. Zhang, C. Xu, and S. Liu, “Spatio-Temporal Analysis of Passenger Travel Patterns in Massive Smart Card Data,” IEEE Trans. Intell. Transp. Syst., vol. 18, pp. 3135–3146, 2017.
• [9] J. Yuan, Y. Zheng, L. Zhang, X. Xie, and G. Sun, “Where to Find My Next Passenger?” in ACM Int. Conf. Ubiquitous Computing (UbiComp), 2011.
• [10] M. Qu, H. Zhu, J. Liu, G. Liu, and H. Xiong, “A Cost-effective Recommender System for Taxi Drivers,” in ACM Int. Conf. Knowledge Discovery and Data Mining (SIGKDD), 2014.
• [11] H. Rong, X. Zhou, C. Yang, Z. Shafiq, and A. Liu, “The Rich and the Poor: A Markov Decision Process Approach to Optimizing Taxi Driver Revenue Efficiency,” in ACM Int. Conf. Information and Knowledge Management (CIKM), 2016.
• [12] C.-M. Tseng and C.-K. Chau, “Viability Analysis of Electric Taxis Using New York City Dataset,” in ACM Workshop on Electric Vehicle Systems, Data and Applications (EVSys), 2017.
• [13] ——, “Personalized Prediction of Vehicle Energy Consumption based on Participatory Sensing,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 11, pp. 3103–3113, 2017.
• [14] C.-K. Chau, K. Elbassioni, and C.-M. Tseng, “Drive Mode Optimization and Path Planning for Plug-in Hybrid Electric Vehicles,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 12, pp. 3421–3432, 2017.
• [15] T. Carpenter, A. R. Curtis, and S. Keshav, “The Return On Investment for Taxi Companies Transitioning to Electric Vehicles - A Case Study in San Francisco,” Transportation, vol. 41, pp. 785–818, 2014.
• [16] New York government. (2017) Electric Vehicle Charging Stations in New York.
• [17] A. Furieri. (2017) The Gaia-SINS federated projects.
• [18] FleetCarma. (2014) Real-world range ramifications: heating and air condition.
• [19] C. Bingham, C. Walsh, and S. Carroll, “Impact of Driving Characteristics on Electric Vehicle Energy Consumption and Range,” IET Intelligent Transport Systems, vol. 6, no. 1, pp. 29–35, 2012.
• [20] L. Feng and B. Chen, “Study the Impact of Driver’s Behavior on the Energy Efficiency of Hybrid Electric Vehicles,” in ASME Intl. Design Engineering Technical Conference, 2013.

## Acknowledgment

The authors would like to thank Srinivasan Keshav and Sgouris Sgouridis for helpful suggestions and discussions.
