EU Long-term Dataset with Multiple Sensors for Autonomous Driving


The field of autonomous driving has grown tremendously over the past few years, along with the rapid progress in sensor technology. One of the major purposes of using sensors is to provide environment perception for vehicle understanding, learning and reasoning, and ultimately for interacting with the environment. In this paper, we introduce a multisensor framework that allows a vehicle to perceive its surroundings and locate itself in a more efficient and accurate way. Our framework integrates eleven heterogeneous sensors including various cameras and lidars, a radar, an IMU (Inertial Measurement Unit), and a GPS/RTK (Global Positioning System / Real-Time Kinematic) receiver, and exploits ROS (Robot Operating System) based software to process the sensory data. In addition, we present a new dataset for autonomous driving that captures many new research challenges (e.g. highly dynamic environments), especially for long-term autonomy (e.g. creating and maintaining maps). The dataset was collected with our instrumented vehicle and is publicly available to the community.

I Introduction

Both academic research and industrial innovation in autonomous driving have seen tremendous growth in the past few years and are expected to continue to grow rapidly in the coming years. This can be explained by two factors: the rapid development of hardware (e.g. sensors and computers) and software (e.g. algorithms and systems), and the need for safe, efficient and low-cost travel as human society develops.

A general framework for autonomous navigation of an unmanned vehicle consists of four modules: sensors, perception and localization, path planning and decision making, and motion control. Typically, the vehicle has to answer three questions: “Where am I?”, “What’s around me?”, and “What should I do?”. As shown in Fig. 1, the vehicle acquires external environmental data (e.g. images, and the distance and velocity of objects) and self-measurements (e.g. position, orientation, velocity and odometry) through various sensors. The sensory data are then delivered to the perception and localization module, which helps the vehicle understand its surroundings and localize itself in a pre-built map. Moreover, the vehicle is expected to understand not only what has happened but also what is going on around it [9], and it may simultaneously update the map with a description of the local environment for long-term autonomy [16, 8]. Afterwards, depending on the poses of the vehicle itself and of other objects, a path is generated by the global planner and can be adjusted by the local planner according to the real-time circumstances. The motion control module then calculates the motor parameters needed to execute the path and sends commands to the actuators. Looping across these four modules, the vehicle can navigate autonomously following a typical “see-think-act” cycle.

Fig. 1: A general multisensor based framework for a map based autonomous driving system.

Effective perception and localization are known to be the most essential of the many modules that allow an autonomous vehicle to operate safely and reliably in our daily life. The former includes the measurement of internal (e.g. velocity and orientation of the vehicle) and external (e.g. human, object and traffic sign recognition) environmental information, while the latter mainly includes visual odometry / SLAM (Simultaneous Localization And Mapping), localization with a map, and place recognition / re-localization. These two tasks are closely related, and both are affected by the sensors used and by the way the data they provide are processed.

Nowadays, heterogeneous sensing systems are commonly used in the field of robotics and autonomous vehicles in order to produce comprehensive environmental information. Commonly used sensors include various cameras, 2D/3D lidar (LIght Detection And Ranging), radar (RAdio Detection And Ranging), IMU (Inertial Measurement Unit), and GNSS (Global Navigation Satellite System). These sensors are combined mainly because different sensors have different (physical) properties, and each category has its own pros and cons [21]. On the other hand, ROS (Robot Operating System) [13] has become the de facto standard platform for the development of software in robotics, and today an increasing number of researchers and companies develop autonomous vehicle software based on it. For example, seven emerging ROS-based autonomous driving systems were presented at ROSCon 2017, while this number was zero in 2016.

In this paper, we report our progress in building an autonomous car at the University of Technology of Belfort-Montbéliard (UTBM) in France since September 2017, with a focus on the completed multisensor framework. First, we introduce a variety of sensors used for efficient perception and localization in autonomous driving, and explain the reasons for choosing them, their installation positions, and some trade-offs we made in the system configuration. Second, we introduce a new dataset for autonomous driving, entirely based on ROS, recorded with our multisensor platform in both urban and suburban areas, where all the sensors are calibrated, the data are approximately synchronized (i.e. at the software level, except for the two 3D lidars, which are synchronized at the hardware level by communicating with positioning satellites), and the ground truth for vehicle localization is provided. This dataset includes many new features for urban driving, such as highly dynamic environments (massive numbers of moving objects affecting vehicle odometry), sloping roads, shared zones, construction bypasses, and aggressive driving. As it captures daily and seasonal changes, it is especially suitable for long-term vehicle autonomy research [11]. Third, we implemented state-of-the-art methods as baselines for lidar odometry benchmarking, with ground-truth trajectories recorded by GPS/RTK. Finally, we illustrate the characteristics of the proposed system via a side-by-side comparison with other vehicle platforms and their related datasets.

Starting to work with an autonomous vehicle can be challenging and time consuming, because one has to face difficulties in design, budgeting and cost control, and implementation from the hardware level (especially with various sensors) to the software level. A further purpose of this paper is therefore to summarize our experience and to help readers quickly overcome similar issues. We hope these descriptions will give the community a practical reference.

II The Framework

So far, there is no perfect, all-purpose sensor; every sensor has limitations and edge cases. For example, GNSS is extremely easy to use for navigation and works in all weather conditions, but its update frequency and accuracy are usually not sufficient to meet the requirements of autonomous driving. Also, buildings and infrastructure in the urban environment are likely to obstruct the signals, thereby leading to positioning failures in many everyday scenes such as urban canyons, tunnels, and underground parking lots. Among visual and range sensors, the 3D lidar is generally very accurate and has a large field of view (FoV). However, the sparse and purely geometric data (i.e. point clouds) obtained from this kind of sensor are of limited use in semantic-related perception tasks. Furthermore, when the vehicle travels at high speed, relevant information is not easily extracted due to scan distortion (see footnote 3). The 2D lidar has similar problems, with further limitations due to its single scan channel and reduced FoV. Nevertheless, 2D lidars are usually cheaper than 3D ones, have mature algorithm support, and have long been widely used in mobile robotics for mapping and localization. Visual cameras can encode rich semantic and texture information into the image, but they have low robustness to lighting and illumination variations. Radar is very robust to light and weather changes, but it lacks range sensing accuracy. In summary, it is difficult to rely on a single sensor type for efficient perception and localization in autonomous driving, which is the concern of this paper. Hence, it is important for researchers and industry to leverage the advantages of different sensors and make the sensors in a multisensor system complementary to one another. Table I summarizes typical advantages and disadvantages of the commonly used sensors.

| Sensors | Pros | Cons |
|---|---|---|
| GNSS | easy-to-use; less weather sensitivity | low positioning accuracy; limited by urban area |
| lidar | high positioning accuracy; fast data collection; can be used day and night | high equipment cost; high computational cost; ineffective during rain |
| camera | low equipment cost; providing intuitive images | low positioning accuracy; affected by lighting |
| radar | reliable detection; unaffected by the weather | low positioning accuracy; slow data collection |
TABLE I: Pros and cons of the commonly used sensors for autonomous driving

II-A Hardware

Fig. 2: The sensors used and their mounting positions.

The sensor configuration of our autonomous car is illustrated in Fig. 2. Its design mainly adheres to two principles: strengthen the visual scope as much as possible, and maximize the overlapping area perceived by multiple sensors. In particular:

  • Two stereo cameras, i.e. a front-facing Bumblebee XB3 and a back-facing Bumblebee2, are mounted on the front and rear of the roof, respectively. Both cameras use CCD (Charge-Coupled Device) sensors in global shutter mode, which is more advantageous than rolling shutter when the vehicle is driving at high speed. In global shutter mode, every pixel in a captured image is exposed at the same instant, whereas in rolling shutter mode the exposure typically moves as a wave from one side of the image to the other.

  • Two Velodyne HDL-32E lidars are mounted on the front portion of the vehicle roof, side by side. Each Velodyne lidar has 32 scan channels, a 360° horizontal and 40° vertical FoV, and a reported measuring range of up to 100m. It is noteworthy that when multiple Velodyne lidars are used in proximity to one another, as in our case, the sensory data may be affected by one sensor picking up a reflection intended for another. In order to reduce the likelihood of the lidars interfering with each other, we used their built-in phase-locking feature to control where the laser firings overlap during data recording, and post-processed the data to remove the data shadows behind each lidar sensor. Details will be given in Section II-B2.

  • Two Pixelink PL-B742F industrial cameras with fisheye lenses are installed in the middle of the roof, facing the lateral sides of the vehicle. Each camera has a CMOS (Complementary Metal-Oxide-Semiconductor) global shutter sensor that freezes high-speed motion, while the fisheye lens captures an extremely wide angle of view. This setting, on the one hand, increases the vehicle’s perception of the environment on both lateral sides, which has not been well studied so far, and on the other hand, adds a semantic complement to the Velodyne lidars.

  • An ibeo LUX 4L lidar is embedded into the front bumper close to the y-axis of the car. It provides four scanning layers, an 85° (or 110° if one uses only two layers) horizontal FoV, and a measurement range of up to 200m. Together with the radar, it is extremely important for our system to ensure the safety of the vehicle itself as well as of other objects (especially humans) in the vicinity of the front of the vehicle.

  • A Continental ARS 308 radar is mounted in a position close to the ibeo LUX lidar, which is very reliable for the detection of moving objects. While less angularly accurate than lidar, radar can work in almost every condition and even use reflection to see behind obstacles. Our framework is designed to detect and track objects in front of the car by “cross-checking” both radar and lidar data.

  • A SICK LMS100-10000 laser rangefinder (i.e. 2D lidar) facing the road is mounted on one side of the front bumper. It measures its surroundings in two-dimensional polar coordinates and provides a 270° FoV. Due to its downward tilt, the sensor is able to scan the road surface and deliver information about road markings and road boundaries. The combined use of the ibeo LUX and the SICK lidars is also recommended by the industrial community, i.e. the former for object detection (dynamics) and the latter for road understanding (statics).

  • A Magellan ProFlex 500 GNSS receiver is placed in the car with two antennas on the roof. One antenna is mounted on the z-axis perpendicular to the car rear axle for receiving satellite signals, and the other is placed at the rear of the roof for synchronizing with an RTK base station. With the help of the RTK enhancement, the GPS positioning is corrected and the positioning error is reduced from the meter level to the centimeter level.

  • An Xsens MTi-28A53G25 IMU is also placed inside the vehicle, outputting linear acceleration, angular velocity, and absolute orientation, among others.
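As a compact summary of the list above, the sensor suite can be written down as a small inventory. The following is a minimal sketch in Python; the `Sensor` type, its field names and the helper functions are illustrative assumptions and are not part of our released software. The models, counts and horizontal fields of view are those given in the list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sensor:
    """One entry of the sensor suite (hypothetical helper type)."""
    model: str
    kind: str
    count: int
    horizontal_fov_deg: Optional[float] = None  # None where the text gives no value


# Models, counts and FoV values are taken from the hardware description.
SENSOR_SUITE = [
    Sensor("Bumblebee XB3", "stereo camera", 1),
    Sensor("Bumblebee2", "stereo camera", 1),
    Sensor("Velodyne HDL-32E", "3D lidar", 2, 360.0),
    Sensor("Pixelink PL-B742F", "fisheye camera", 2),
    Sensor("ibeo LUX 4L", "4-layer lidar", 1, 85.0),
    Sensor("Continental ARS 308", "radar", 1),
    Sensor("SICK LMS100-10000", "2D lidar", 1, 270.0),
    Sensor("Magellan ProFlex 500", "GNSS receiver", 1),
    Sensor("Xsens MTi-28A53G25", "IMU", 1),
]


def total_sensors(suite):
    """Total number of physical devices in the suite."""
    return sum(s.count for s in suite)


def count_matching(suite, keyword):
    """Number of devices whose kind contains the given keyword."""
    return sum(s.count for s in suite if keyword in s.kind)


if __name__ == "__main__":
    print("total:", total_sensors(SENSOR_SUITE))
    print("lidars:", count_matching(SENSOR_SUITE, "lidar"))
    print("cameras:", count_matching(SENSOR_SUITE, "camera"))
```

Counting the entries this way gives eleven devices in total, of which four are lidars and four are cameras, in line with the figures quoted elsewhere in the paper.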

It is worth mentioning that a trade-off we made in our sensor configuration is the side-by-side use of two Velodyne 32-layer lidars rather than a single lidar or other models. The reason for this is twofold. First, in the single-lidar solution, the lidar is mounted on a “tower” in the middle of the roof in order to eliminate occlusions caused by the roof, which is not an attractive option from an industrial design point of view. Second, other models such as a 64-layer lidar are more expensive than two 32-layer lidars, which in turn cost more than two 16-layer lidars. We therefore use a pair of 32-layer lidars as the trade-off between sensing efficiency and hardware cost.

Regarding the reception of sensory data, the ibeo LUX lidar and the radar are connected to a customized control unit that is used for real-time vehicle handling and low-level control such as steering, acceleration and braking. This setting is necessary because the real-time response from these two sensors over the CAN bus is extremely important for driving safety. All the lidars (via a high-speed Ethernet network), the radar (via RS-232), the cameras (via IEEE 1394), and the GPS/IMU (via USB) are connected to a DELL Precision Tower 3620 workstation. The workstation is used for data collection only; a dedicated embedded automation computer will be used as the master computer, ensuring the operation of the most essential system modules such as SLAM, point cloud clustering, sensor fusion, localization, and path planning. A gaming laptop (with a high-performance GPU) will then serve as a slave unit responsible for processing computationally intensive and algorithmically complex jobs, especially visual computing. In addition, our current system is equipped with two 60Ah external car batteries that provide more than one hour of autonomy.

II-B Software

Our software system is based entirely on ROS. For data collection, all the sensors are physically connected to the DELL workstation and all ROS nodes run locally. This setting maximizes data synchronization at the software level (timestamped by ROS; see footnote 4). The ROS-based software architecture diagram and the publish frequency of each sensor for data collection are shown in Fig. 3. It is worth pointing out that the collection is done with a CPU-only (Intel i7-7700) computer without any data delay. This is mainly because we only record the raw data and leave the post-processing to offline playback. It is also worth noting that we focus on providing pioneering experience in vehicle perception purely based on ROS-1 (which can serve as a reference for ROS-2), and leave aside data collection at the vehicle regulation level. Moreover, as we provide raw data from different devices, advanced processing such as motion compensation can be done by the end user.
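Since the data are only approximately synchronized through their timestamps, an end user typically has to associate messages from different sensors during offline playback. The following minimal sketch pairs each timestamp of a reference sensor with the closest timestamp of another sensor. The function name, the tolerance and the example timestamps are assumptions chosen for illustration; they are not taken from our recording software.

```python
from bisect import bisect_left


def nearest_match(reference_stamps, other_stamps, max_offset):
    """Pair each reference timestamp with the closest timestamp of another sensor.

    Both lists must be sorted and expressed in seconds. Pairs whose time
    difference exceeds max_offset are dropped.
    """
    pairs = []
    for t in reference_stamps:
        i = bisect_left(other_stamps, t)
        # Only the neighbours just before and just after t can be the closest.
        candidates = other_stamps[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda s: abs(s - t))
        if abs(best - t) <= max_offset:
            pairs.append((t, best))
    return pairs


if __name__ == "__main__":
    lidar_stamps = [0.0, 0.1, 0.2]            # hypothetical lidar timestamps
    camera_stamps = [0.01, 0.07, 0.14, 0.26]  # hypothetical camera timestamps
    for lidar_t, camera_t in nearest_match(lidar_stamps, camera_stamps, 0.05):
        print(lidar_t, "<->", camera_t)
```

In this example the third lidar timestamp has no camera timestamp within the tolerance and is therefore left unpaired, which is the behaviour one usually wants when a message is missing or late.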

Fig. 3: ROS-based software architecture diagram for data collection. The data is saved in rosbag format. Please note that, in order to help the reader reproduce the system, we indicate the ROS package name instead of the ROS node name for each sensor driver. The ROS master, however, actually communicates with the node provided by the package.

II-B1 Sensor Calibration

Like most other multisensor systems, all our cameras and lidars are both intrinsically and extrinsically calibrated, and the calibration files are publicly available. The intrinsic calibration of the monocular cameras as well as the extrinsic calibration of the stereo cameras was performed with a chessboard using the ROS camera_calibration package, while the lidars use the factory intrinsic parameters. Then, all other sensors were calibrated with respect to the Velodyne lidars. The extrinsic parameters of the lidars were estimated by minimizing the voxel-wise distance between the points from different sensors while driving the car in a structured environment with several landmarks. To calibrate the extrinsic transform between the stereo camera and the Velodyne lidar, we drove the car to face the corner of a building and manually aligned the two point clouds on three planes, i.e. two walls and the ground. An example of aligned sensor data is visualized in Fig. 4. As we can see, after calibration, the points from all the lidars and the stereo cameras are properly aligned.
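To illustrate how an extrinsic calibration is used once it has been estimated, the sketch below maps points from one sensor frame into a reference frame with a 4x4 homogeneous transform. It is a simplified illustration: the rotation is restricted to a yaw angle, and the numerical values are placeholders rather than our calibration results. The function names are hypothetical.

```python
import math


def make_transform(yaw_deg, tx, ty, tz):
    """Build a 4x4 homogeneous transform from a yaw angle (degrees) and a translation."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_points(T, points):
    """Map (x, y, z) points from the source sensor frame into the reference frame."""
    out = []
    for x, y, z in points:
        out.append(tuple(
            T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3] for r in range(3)
        ))
    return out


if __name__ == "__main__":
    # Placeholder extrinsics: 90 degree yaw, 1 m forward, 0.5 m up.
    T = make_transform(90.0, 1.0, 0.0, 0.5)
    print(transform_points(T, [(1.0, 0.0, 0.0)]))
```

A full calibration would of course include roll and pitch as well; the structure of the computation stays the same, only the rotation block of the matrix changes.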

Fig. 4: A ROS Rviz screenshot of the collected data with calibrated sensors. The UTBM robocar is in the centre of the image with a truck in front. The red ring points come from the Velodyne, white points from the SICK, and colored dots from the ibeo LUX lidar. The point clouds in front of and behind the car are from the two Bumblebee stereo cameras.

II-B2 Configuration of two Velodyne lidars

As mentioned above, the two Velodyne lidars have to be properly configured in order to work efficiently. Firstly, the phase lock feature of each sensor needs to be set to synchronize the relative rotational position of the two lidars, based on the Pulse Per Second (PPS) signal, which can be obtained from the GPS receiver connected to the lidar’s interface box. In our case, where the two sensors are placed on the left and right sides of the roof, the left one has its phase lock offset set to 90°, while the right one is set to 270°, as shown in Fig. 5.

Fig. 5: Phase offset setting of two side-by-side installed Velodyne lidars

Secondly, Eq. 1 [5] can be used to remove any spurious data due to blockage or reflections from the opposing sensor (i.e. data shadows behind each other, see Fig. 6):

$$\theta_s = 2 \tan^{-1}\left(\frac{0.5 \times D_{sensor}}{d_{sensor}}\right) \tag{1}$$

where $\theta_s$ is the subtended angle, $D_{sensor}$ is the diameter of the far sensor, and $d_{sensor}$ is the distance between sensor centers.
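The shadow removal can be sketched as follows. The subtended angle is computed as twice the arctangent of half the far sensor's diameter divided by the distance between sensor centers, and returns whose azimuth falls inside the resulting sector are discarded. This is a minimal sketch under stated assumptions: the azimuth convention (`atan2(y, x)` in the sensor's horizontal plane), the direction of the shadow center, and the example dimensions are hypothetical and must be adapted to the actual mounting.

```python
import math


def subtended_angle_deg(sensor_diameter, center_distance):
    """Angle (degrees) subtended by the far sensor: 2 * atan(0.5 * D / d)."""
    return math.degrees(2.0 * math.atan(0.5 * sensor_diameter / center_distance))


def in_shadow(azimuth_deg, shadow_center_deg, theta_s_deg):
    """True if an azimuth lies inside the shadow sector centred on the far sensor."""
    # Wrap the angular difference into [-180, 180) before comparing.
    diff = (azimuth_deg - shadow_center_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= theta_s_deg / 2.0


def remove_shadow(points, shadow_center_deg, theta_s_deg):
    """Drop (x, y, z) points whose azimuth falls inside the shadow sector."""
    kept = []
    for x, y, z in points:
        azimuth = math.degrees(math.atan2(y, x))
        if not in_shadow(azimuth, shadow_center_deg, theta_s_deg):
            kept.append((x, y, z))
    return kept


if __name__ == "__main__":
    # Hypothetical geometry: 0.1 m sensor diameter, 0.5 m between centers,
    # far sensor located along the +y axis (azimuth 90 degrees).
    theta_s = subtended_angle_deg(0.1, 0.5)
    cloud = [(5.0, 0.0, 0.0), (0.0, 5.0, 0.0), (1.0, 5.0, 0.0)]
    print(theta_s, remove_shadow(cloud, 90.0, theta_s))
```

Filtering on azimuth alone removes the whole sector regardless of range, which is a conservative choice; a range-aware filter would keep more data but requires a more detailed model of the occlusion.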

Fig. 6: Data shadows behind a pair of Velodyne lidars.

Moreover, in order to avoid network congestion caused by the sensors broadcasting their data, we configure each Velodyne (and likewise the SICK and the ibeo LUX lidars) to transmit its packets to a specific (i.e. non-broadcast) destination IP address (in our case, the IP address of the workstation), via a unique port.

III Dataset

Our recording software is fully implemented in ROS. Data collection was carried out on Ubuntu 16.04 LTS (64-bit) with ROS Kinetic. The vehicle was driven by a human and all ADAS (Advanced Driver Assistance System) functions were disabled. The data collection was performed in the downtown (for long-term data) and a suburb (for roundabout data) of Montbéliard in France. The vehicle speed was limited to 50km/h following the French traffic rules. As one would expect, the urban scene during the day (recording time around 15h to 16h) is highly dynamic, while the evening (recording time around 21h) is relatively calm. Light and vegetation (especially street trees) are abundant in summer, while winter is generally poorly lit, with little vegetation and sometimes even ice and snow cover. All data were recorded in rosbag files for easy sharing with the community. The data collection itineraries, which were carefully selected after many trials, can be seen in Fig. 7.

Fig. 7: Data collection itineraries drawn on Google Maps. Left: for long-term data. Right: for roundabout data.

For the long-term data, we focus on environments that are closely related to periodic changes [10, 18] such as daily, weekly and seasonal changes. We drove the same route for eleven rounds at different times. Each round is about 5km long, and the route passes through the city centre, a park, a residential area, a commercial area and a bridge over the river Doubs, and includes a small and a big road loop (for loop-closure purposes). The RTK base station was placed at a fixed location on a mound (marked by the red dot in Fig. 7(left), 357m above sea level) in order to communicate with the GNSS receiver in the car with minimal signal occlusion. With these settings, we recorded data during the day, at night, during the week, and in summer and winter (with snow), always following the same itinerary. At the same time, we captured many new research challenges such as uphill/downhill roads, shared zones, road diversions, and highly dynamic/dense traffic.

Moreover, roundabouts are very common in France as well as in other European countries. This road condition is not easy to handle even for humans. The key is to accurately predict the behavior of other vehicles. To promote related research on this topic, we repeatedly recorded data in the area near the UTBM Montbéliard campus, which contains 10 roundabouts of various sizes within a range of approximately 0.75km (see Fig. 7(right)).

III-A Lidar Odometry Benchmarking

As part of the dataset, we establish several baselines for lidar odometry, which is one of the challenges provided by our dataset. We forked the implementations of the following state-of-the-art methods and experimented with them on our dataset:

  • loam_velodyne [23] is one of the most advanced lidar odometry methods. It provides real-time SLAM for 3D lidar and achieved state-of-the-art performance in the KITTI benchmark [4]. The implementation is robust for both structured (urban) and unstructured (highway) environments, and a scan restoration mechanism is devised for high-speed driving.

  • LeGO-LOAM [14] is a lightweight and ground-optimized LOAM, mainly designed to address the deterioration of LOAM's performance when resources are limited and when operating in noisy environments. Point cloud segmentation in LeGO-LOAM is performed to discard points that may represent unreliable features after ground separation.

As an example, Fig. 8 shows the odometry result of running the loam_velodyne algorithm on one recording round. Users are encouraged to evaluate their methods, compare them with the provided baselines on devices with different levels of computational capability, and submit their results to our baseline GitHub repository. However, only real-time results are accepted, as real-time performance is critically important for vehicle localization in autonomous driving.
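When comparing an estimated trajectory with the GPS/RTK ground truth, one common choice of metric is the root-mean-square error of the position differences. The text above does not prescribe a particular metric, so the following is only a minimal sketch of one possible evaluation; it assumes that both trajectories are already time-aligned, expressed in the same coordinate frame, and of equal length. The function name is hypothetical.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of the position error between two time-aligned trajectories.

    Each trajectory is a sequence of (x, y, z) positions in the same frame.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(p_est, p_gt))
        for p_est, p_gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Toy trajectories: the estimate deviates by 1 m at the second pose.
    est = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    gt = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
    print(ate_rmse(est, gt))
```

In practice the time alignment and the frame alignment between the lidar odometry and the GPS/RTK track have to be done beforehand, and both steps influence the resulting error.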

Fig. 8: Evaluation example of the baseline methods.

III-B Long-term Autonomy

Towards an off-the-shelf autonomous driving system, long-term autonomy, including long-term vehicle localization and mapping as well as dynamic object prediction, is necessary. For this goal, we introduce the concepts of “self-aware localization” and “liability-aware long-term mapping” to advance the robustness of vehicle localization in a real-life and changing environment. More specifically, the former means that the vehicle should be able to wake up in any previously known location [3], while the latter enables the vehicle to maintain the map in the long term by monitoring the variance of landmarks and the goodness of map alignment [16]. Moreover, the proposed long-term dataset can be used to predict the occupancy and presence of dynamic objects such as humans and cars. The periodic layout changes and human activities can be tracked and modelled using either frequency modelling [9] or Recurrent Neural Networks (RNNs) [16]. The predicted occupancy map and human activity patterns can ultimately facilitate vehicle motion planning in dynamic urban environments. In this paper, we present multiple sessions of driving data with variations in lighting and landmarks. We propose long-term localization and mapping as well as dynamic object prediction as open problems, and encourage researchers to investigate potential solutions with our dedicated dataset.
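To give an intuition of what modelling periodic changes means, the sketch below fits a mean value plus a single sinusoid of a known period to a series of occupancy observations and uses it for prediction. This is a deliberately simplified, single-frequency illustration and not the method of [9]; the function names, the 24-hour period and the synthetic data are assumptions made for the example.

```python
import cmath
import math


def fit_periodic(times, values, period):
    """Fit mean + one sinusoid of the given period to observations in [0, 1]."""
    omega = 2.0 * math.pi / period
    mean = sum(values) / len(values)
    # Complex Fourier coefficient of the zero-mean signal at frequency omega.
    gamma = sum(
        (v - mean) * cmath.exp(-1j * omega * t) for t, v in zip(times, values)
    ) / len(values)
    return mean, gamma, omega


def predict(model, t):
    """Predicted occupancy probability at time t, clamped to [0, 1]."""
    mean, gamma, omega = model
    p = mean + 2.0 * (gamma * cmath.exp(1j * omega * t)).real
    return min(1.0, max(0.0, p))


if __name__ == "__main__":
    # Synthetic hourly observations over two days with a daily rhythm.
    hours = list(range(48))
    observed = [0.5 + 0.4 * math.cos(2.0 * math.pi * h / 24.0) for h in hours]
    model = fit_periodic(hours, observed, period=24.0)
    for h in (0, 6, 12):
        print(h, round(predict(model, h), 3))
```

The fit is exact here because the synthetic data cover whole periods with uniform sampling; with real, irregularly sampled and binary observations the estimate is only approximate, and several periods (daily, weekly, seasonal) would normally be modelled together.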

III-C Roundabout Challenge

Roundabouts are unavoidable and can be very challenging for autonomous driving. France has the largest number of roundabouts in the world (about 50,000), with considerable variety. The roundabout data we provide aim to support related research on vehicle behavior prediction and to help decrease car crashes in such situations. On the one hand, one can extract information about a car’s turn signals from the images, and even the steering of its wheels. On the other hand, as we drove a full circle around each roundabout, users have long, continuous data from which to learn and predict the trajectories of surrounding vehicles.

IV Related Work

Over the past few years, numerous platforms and resources for autonomous driving have emerged and attracted public attention. The AnnieWAY platform with its famous KITTI dataset [4] has always had a strong influence in the community. This dataset is the most widely used visual perception dataset for autonomous driving, recorded with a sensing system comprising an OXTS RT 3003 GPS/IMU integrated system, a Velodyne HDL-64E 3D lidar, two Point Grey Flea 2 grayscale cameras, and two Point Grey Flea 2 color cameras. With this configuration, the instrumented vehicle is able to produce 10 lidar frames per second with 100k points per frame for lidar-based localization and 3D object detection, two gray images for visual odometry, and two color images for optical flow estimation, object detection, tracking and semantic understanding benchmarks.

The RobotCar from the University of Oxford is considered to be another powerful, competitive platform. The publicly available dataset [12] is the first multi-sensor long-term on-road driving dataset. The Oxford RobotCar is equipped with a Point Grey Bumblebee XB3 stereo camera, three Point Grey Grasshopper2 fisheye cameras, two SICK LMS-151 2D lidars and a SICK LD-MRS 3D lidar. With this configuration, the three fisheye cameras cover a 360° FoV, and the 2D/3D lidars and stereo cameras yield data streams at 11fps and 16fps, respectively. This dataset was collected over a period of one year and covers around 1000km in total.

The KAIST dataset [7] focuses on complex urban environments such as downtown areas, apartment complexes, and underground parking lots, and the data collection was performed with a vehicle equipped with 13 sensors. Recently, Waymo [17] (formerly the Google self-driving car project) started to release part of their data, recorded across a range of conditions in multiple cities in the US. There is no doubt that this automotive-grade dataset will make a significant contribution to the community.

Other datasets, including Cityscapes [2], ApolloScape [6], and BDD100K [22], mainly focus on visual perception such as object detection, semantic segmentation, and lane/drivable area segmentation, and only visual data (i.e. images) are released. As the present paper focuses more on multisensor perception and localization, we do not give further details of these datasets here. For a more intuitive view, a comparison between our dataset and the aforementioned ones is provided in Table II.

| Dataset | Sensor | Synchronization | Ground-truth | Location | Weather | Time |
|---|---|---|---|---|---|---|
| Ours | 2 32-layer lidar; 1 4-layer lidar; 1 1-layer lidar; 2 stereo camera; 2 fisheye camera; 1 radar; 1 independent IMU | software (ROS timestamp) and hardware (PPS for the two Velodynes) | GPS-RTK/IMU for vehicle self-localization | France | sun, clouds, snow | day, dusk, night, three seasons (spring, summer, winter) |
| KITTI [4] | 1 64-layer lidar; 2 grayscale camera; 2 color camera; 1 GPS-RTK/IMU | software and hardware (reed contact) | scene flow, odometry, object detection & tracking, road & lane | Germany | clear | day, autumn |
| Oxford [12] | 1 4-layer lidar; 2 1-layer lidar; 1 stereo camera; 3 fisheye camera | software | GPS-RTK/INS for vehicle self-localization | UK | sun, clouds, overcast, rain, snow | day, dusk, night, four seasons |
| Cityscapes [2] | 1 stereo camera | N/A | semantics | Germany, France, Switzerland | sun, clouds | day, three seasons (spring, summer, fall) |
| KAIST [7] | 2 16-layer lidar; 2 1-layer lidar; 2 monocular camera; 1 consumer-level GPS; 1 GPS-RTK; 1 fiber optics gyro; 1 independent IMU; 2 wheel encoder; 1 altimeter | software (ROS timestamp) and hardware (PPS for the two Velodynes, an external trigger for the two monocular cameras to get stereo) | SLAM algorithm for vehicle self-localization | South Korea | clear | day |
| ApolloScape [6] | 2 1-layer lidar; 6 monocular camera; 1 GPS-RTK/IMU | unknown | scene parsing, car instance, lane segmentation, self detection & tracking, trajectory, stereo | China | unknown | day |
| BDD100K [22] | 1 monocular camera | N/A | semantics, drivable areas, lane markings | US | sun, rain, snow | day, dusk, night, dawn |
| nuScenes [1] | 1 32-layer lidar; 6 monocular camera; 5 radar; 1 GPS-RTK; 1 independent IMU | software | HD map-based localization, object detection & tracking | US, Singapore | sun, clouds, rain | day, night |
| Waymo [17] | 5 lidar; 5 camera | unknown but very well-synchronized | object detection & tracking both for lidar & camera | US | sun, rain | day, night |

| Dataset | Distance | Data format | Baseline | Download | License | Privacy | First release |
|---|---|---|---|---|---|---|---|
| Ours | 63.4km | rosbag (All-in-One) | 3 | free | CC BY-NC-SA 4.0 | face & plate | Nov. 2018 |
| KITTI [4] | 39.2km | png (camera); txt (GPS-RTK/IMU); bin (lidar) | 3 | registration | CC BY-NC-SA 3.0 | removal under request | Mar. 2012 |
| Oxford [12] | 1010.46km | png (camera); csv (GPS-RTK/INS); bin (lidar) | 0 | registration | CC BY-NC-SA 4.0 | removal under request | Oct. 2016 |
| Cityscapes [2] | unknown | png (camera) | 4 | registration | Cityscapes License | removal under request | Feb. 2016 |
| KAIST [7] | 190.989km | bin (lidar); png (camera); csv (GPS-RTK/IMU) | 1 | registration | CC BY-NC-SA 4.0 | removal under request | Sep. 2017 |
| ApolloScape [6] | unknown | png (lidar); jpg (camera) | 1 | registration | ApolloScape License | removal under request | Apr. 2018 |
| BDD100K [22] | unknown | mp4, png (camera) | 3 | registration | unknown | unknown | May 2018 |
| nuScenes [1] | 242km | xml | 3 | registration | CC BY-NC-SA 4.0 | face & plate | Mar. 2019 |
| Waymo [17] | unknown | range image (lidar); jpeg (camera) | 3 | registration | Waymo License | face & plate removed | Aug. 2019 |

right-hand traffic, left-hand traffic, vertical scanning, device model undisclosed,
only including methods published with the paper, excluding community contributions.

TABLE II: A comparison of the existing datasets for autonomous driving

For a deeper analysis, KITTI provides a relatively comprehensive set of challenges for both perception and localization, and its hardware configuration, i.e. a combination of 3D lidar and stereo cameras, is widely used for prototyping robot cars by autonomous vehicle companies. However, the KITTI dataset still has two limitations. First, the dataset was captured in only one session, so long-term variations of the scene, e.g. in lighting and season, are not investigated. Second, the visual cameras do not cover the full FoV, so blind spots exist. The Oxford dataset investigates vision-based perception and localization under variations in season, weather and time; however, modern 3D lidar data are not included. In this paper, we leverage the pros of the platform designs of KITTI and Oxford, and eliminate the cons. That is, we propose a multisensor framework combining four lidars (including two Velodynes) and four cameras to provide stronger range and visual sensing.

Apart from hardware configurations and dataset collection, there exist widely cited open-source repositories, such as Apollo, Autoware, and Udacity, which provide researchers with a platform to contribute and share autonomous driving software.

V Conclusion

In this paper, we presented our autonomous driving platform with a focus on the multisensor framework for efficient perception and localization. To build the framework, we integrated eleven heterogeneous sensors, including various lidars and cameras, a radar, and a GPS/IMU, in order to enhance the vehicle’s visual scope and perception capability. By exploiting the heterogeneity of the different sensory data (e.g. through sensor fusion), the vehicle is also expected to have better situation awareness and ultimately to improve the safety of autonomous driving for human society. Using our instrumented car, a ROS-based dataset has been recorded cumulatively and is publicly available to the community. This dataset presents many new research challenges and, as it contains periodic changes, it is especially suitable for the study of long-term autonomy. We hope our efforts and hands-on experience can advance development and help solve related problems in autonomous driving, especially for long-term autonomy, such as persistent mapping [16] and long-term prediction [18, 15], as well as online/lifelong learning [9, 21, 10, 20, 19].

Furthermore, as we take privacy very seriously and handle personal data in line with the EU’s data protection law (i.e. the General Data Protection Regulation (GDPR)), we used deep learning-based methods18 to post-process the camera-recorded images in order to blur faces and license plates. The images have been released progressively since the first quarter of 2020.
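
As a rough illustration of the post-processing step only, the sketch below obscures regions of a grayscale image given their bounding boxes. It is not the method used for the dataset: the detector that would produce the boxes is assumed to exist and is not shown, the names `anonymize`, `Image` and `Box` are hypothetical, and each region is filled with its mean intensity as a simple stand-in for a proper blur.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image: rows of 0-255 intensities
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def anonymize(image: Image, boxes: List[Box]) -> Image:
    """Return a copy of `image` with each box replaced by its mean intensity.

    `boxes` is assumed to come from a separate face / license plate
    detector (hypothetical here); the input image is left unmodified.
    """
    out = [row[:] for row in image]
    height = len(image)
    width = len(image[0]) if image else 0
    for top, left, bottom, right in boxes:
        # Clip the box to the image bounds and skip empty regions.
        top, left = max(0, top), max(0, left)
        bottom, right = min(height, bottom), min(width, right)
        if top >= bottom or left >= right:
            continue
        pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
        mean = sum(pixels) // len(pixels)
        # Overwrite the region so the original detail cannot be recovered.
        for r in range(top, bottom):
            for c in range(left, right):
                out[r][c] = mean
    return out
```

Filling with a single value discards all detail inside the box, which is stricter than a blur; a real pipeline would more likely apply a Gaussian or box filter to keep the image visually natural.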


  3. Motion compensation could alleviate this problem.
  4. Data synchronization at the hardware level is beyond the scope of this paper.


  1. H. Caesar, V. Bankiti, A. H. Lang, S. Vora, V. E. Liong, Q. Xu, A. Krishnan, Y. Pan, G. Baldan and O. Beijbom (2019) nuScenes: a multimodal dataset for autonomous driving. CoRR abs/1903.11027. External Links: 1903.11027 Cited by: TABLE II.
  2. M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth and B. Schiele (2016) The cityscapes dataset for semantic urban scene understanding. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3213–3223. Cited by: TABLE II, §IV.
  3. R. Dubé, D. Dugas, E. Stumm, J. I. Nieto, R. Siegwart and C. Cadena (2017) SegMatch: segment based place recognition in 3d point clouds. In IEEE International Conference on Robotics and Automation (ICRA), pp. 5266–5272. Cited by: §III-B.
  4. A. Geiger, P. Lenz, C. Stiller and R. Urtasun (2013) Vision meets robotics: the KITTI dataset. International Journal of Robotics Research 32 (11), pp. 1231–1237. Cited by: 1st item, TABLE II, §IV.
  5. (2018) HDL-32E user manual. Velodyne. Note: 63-9113 Rev. M Cited by: §II-B2.
  6. X. Huang, P. Wang, X. Cheng, D. Zhou, Q. Geng and R. Yang (2018) The apolloscape open dataset for autonomous driving and its application. CoRR abs/1803.06184. External Links: 1803.06184 Cited by: TABLE II, §IV.
  7. J. Jeong, Y. Cho, Y. Shin, H. Roh and A. Kim (2019) Complex urban dataset with multi-level sensors from highly diverse urban environments. The International Journal of Robotics Research. Cited by: TABLE II, §IV.
  8. T. Krajník, J. P. Fentanes, G. Cielniak, C. Dondrup and T. Duckett (2014) Spectral analysis for long-term robotic mapping. In IEEE International Conference on Robotics and Automation (ICRA), pp. 3706–3711. Cited by: §I.
  9. T. Krajník, J. P. Fentanes, J. M. Santos and T. Duckett (2017) FreMEn: frequency map enhancement for long-term mobile robot autonomy in changing environments. IEEE Transactions on Robotics 33 (4), pp. 964–977. Cited by: §I, §III-B, §V.
  10. T. Krajnik, T. Vintr, S. Molina, J. P. Fentanes, G. Cielniak, O. M. Mozos, G. Broughton and T. Duckett (2019) Warped hypertime representations for long-term autonomy of mobile robots. IEEE Robotics and Automation Letters 4 (4), pp. 3310–3317. Cited by: §III, §V.
  11. L. Kunze, N. Hawes, T. Duckett, M. Hanheide and T. Krajnik (2018) Artificial intelligence for long-term robot autonomy: a survey. IEEE Robotics and Automation Letters 3, pp. 4023–4030. Cited by: §I.
  12. W. Maddern, G. Pascoe, C. Linegar and P. Newman (2017) 1 year, 1000 km: the oxford robotcar dataset. The International Journal of Robotics Research 36 (1), pp. 3–15. Cited by: TABLE II, §IV.
  13. M. Quigley, K. Conley, B. P. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler and A. Y. Ng (2009) ROS: an open-source robot operating system. In ICRA Workshop on Open Source Software, Cited by: §I.
  14. T. Shan and B. Englot (2018) LeGO-LOAM: lightweight and ground-optimized lidar odometry and mapping on variable terrain. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4758–4765. Cited by: 2nd item.
  15. L. Sun, Z. Yan, S. M. Mellado, M. Hanheide and T. Duckett (2018-05) 3DOF pedestrian trajectory prediction learned from long-term autonomous mobile robot deployment data. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia. Cited by: §V.
  16. L. Sun, Z. Yan, A. Zaganidis, C. Zhao and T. Duckett (2018) Recurrent-octomap: learning state-based map refinement for long-term semantic mapping with 3d-lidar data. IEEE Robotics and Automation Letters 3 (4), pp. 3749–3756. Cited by: §I, §III-B, §V.
  17. P. Sun, H. Kretzschmar, X. Dotiwalla, A. Chouard, V. Patnaik, P. Tsui, J. Guo, Y. Zhou, Y. Chai, B. Caine, V. Vasudevan, W. Han, J. Ngiam, H. Zhao, A. Timofeev, S. Ettinger, M. Krivokon, A. Gao, A. Joshi, Y. Zhang, J. Shlens, Z. Chen and D. Anguelov (2019) Scalability in perception for autonomous driving: waymo open dataset. CoRR abs/1912.04838. External Links: 1912.04838 Cited by: TABLE II, §IV.
  18. T. Vintr, Z. Yan, T. Duckett and T. Krajnik (2019) Spatio-temporal representation for long-term anticipation of human presence in service robotics. In IEEE International Conference on Robotics and Automation (ICRA), pp. 2620–2626. Cited by: §III, §V.
  19. Z. Yan, T. Duckett and N. Bellotto (2017-09) Online learning for human classification in 3D LiDAR-based tracking. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, Canada, pp. 864–871. Cited by: §V.
  20. Z. Yan, T. Duckett and N. Bellotto (2020) Online learning for 3d lidar-based human detection: experimental analysis of point cloud clustering and classification methods. Autonomous Robots 44 (2), pp. 147–164. Cited by: §V.
  21. Z. Yan, L. Sun, T. Duckett and N. Bellotto (2018-10) Multisensor online transfer learning for 3d lidar-based human detection with a mobile robot. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain. Cited by: §I, §V.
  22. F. Yu, W. Xian, Y. Chen, F. Liu, M. Liao, V. Madhavan and T. Darrell (2018) BDD100K: A diverse driving video database with scalable annotation tooling. CoRR abs/1805.04687. External Links: 1805.04687 Cited by: TABLE II, §IV.
  23. J. Zhang and S. Singh (2014) LOAM: lidar odometry and mapping in real-time. In Robotics: Science and Systems, Vol. 2, pp. 9. Cited by: 1st item.