An Effective Multi-Cue Positioning System for Agricultural Robotics
Abstract
The self-localization capability is a crucial component for Unmanned Ground Vehicles (UGVs) in farming applications. Approaches based solely on visual information or on low-cost GPS are prone to failure in such scenarios. In this paper, we present a robust and accurate 3D global pose estimation framework, designed to take full advantage of heterogeneous sensory data. By modeling the pose estimation problem as a pose graph optimization, our approach simultaneously mitigates the cumulative drift introduced by motion estimation systems (wheel odometry, visual odometry, ...), and the noise introduced by raw GPS readings. Along with a suitable motion model, our system also integrates two additional types of constraints: (i) a Digital Elevation Model (DEM) and (ii) a Markov Random Field (MRF) assumption. We demonstrate how using these additional cues substantially reduces the error along the altitude axis and how this benefit spreads to the other components of the state. We report exhaustive experiments combining several sensor setups, showing localization accuracy improvements from 37% to 76% with respect to using only the GPS. We also show that our approach provides accurate results even if the GPS temporarily changes positioning mode. We release our open-source C++ implementation and two challenging datasets with their relative ground truth.
1 Introduction
It is commonly believed that the exploitation of autonomous robots in agriculture represents one of the applications with the greatest impact on food security, sustainability, reduction of chemical treatments, and minimization of human effort. In this context, an accurate global pose estimation system is an essential component of an effective farming robot in order to successfully accomplish several tasks: (i) navigation and path planning; (ii) autonomous ground intervention, e.g., selective spraying or mechanical removal of accurately detected weeds; (iii) acquisition of relevant semantic information, such as weed distribution and positions, fertilizer demand, crop health, and global field status. However, self-localization inside an agricultural environment is a complex task: the scene is rather homogeneous, visually repetitive, and often poor in distinguishable reference points. For this reason, conventional landmark-based localization approaches can easily fail. Currently, most systems rely on high-end Real-Time Kinematic Global Positioning Systems (RTK-GPSs).
In this paper we present a robust and accurate 3D global pose estimation system for UGVs (Unmanned Ground Vehicles), designed to address the specific challenges of an agricultural environment. Our system effectively fuses several heterogeneous cues extracted from low-cost, consumer-grade sensors by leveraging the strengths of each sensor and the specific characteristics of the agricultural context.
We cast the global localization problem as a pose graph optimization problem (Sec. 2): the constraints between consecutive nodes are represented by motion estimations provided by the UGV wheel odometry, local point cloud registration, and a visual odometry (VO) front-end that provides a full 6D ego-motion estimation with small cumulative drift. In addition, we introduce two further types of constraints:

- An altitude prior, provided by a Digital Elevation Model (DEM);
- A smoothness constraint for the altitude of adjacent nodes.
Both the newly introduced constraints are justified by the assumption that in an agricultural field the altitude varies slowly, i.e., the terrain can be approximated by piecewise smooth surfaces. The smoothness constraints exploit the fact that a farming robot traverses the field by following the crop rows; hence, by using the Markov assumption, the built pose graph can be arranged as a Markov Random Field. The motion of the UGV is finally constrained using an Ackermann motion model extended to the non-planar motion case. The integration of such constraints not only improves the accuracy of the altitude estimation, but it also positively affects the estimates of the remaining state components, i.e., the planar position and the orientation (see Sec. 4).
The optimization problem (Sec. 3) is then iteratively solved by exploiting a graph-based optimization framework [3] in a sliding-window fashion (Sec. 3.3), i.e., optimizing the sub-graphs associated with the most recent sensor readings. The sliding-window optimization allows us to obtain online global localization results that approximate the results achievable by an offline optimization over the whole dataset.
In order to validate our approach (Sec. 4), we used, and make publicly available with this paper, two novel challenging datasets acquired using a Bosch BoniRob UGV (Fig. 1, left) equipped with, among several other calibrated sensors, two types of low-cost GNSSs: a Precise Point Positioning (PPP) GPS and a consumer-grade RTK-GPS.
We report exhaustive experiments with several sensor setups, showing remarkable results: the global localization accuracy is improved by up to 37% and 76% compared with the raw localization obtained by using only the raw RTK-GPS and PPP-GPS readings, respectively (e.g., Fig. 1). We also show that our approach enables accurate localization of the UGV even if the GPS changes positioning mode for some time (e.g., from float to DGPS mode).
All the datasets and an open-source C++ implementation of the proposed system are freely available for download at:
http://www.dis.uniroma1.it/~labrococo/fds .
1.1 Related Work
The problem of global pose estimation for UGVs has been intensively studied in the past, especially with regard to the localization of self-driving vehicles and outdoor autonomous robots moving in urban scenarios. All these systems rely on GPS readings, often coupled with position priors estimated, e.g., from IMUs [4], cameras [5, 6, 7, 8, 9], or LIDARs [10]. Moreover, a given map can be used to constrain the robot motion (e.g., 2D road maps [11] or Digital Elevation Models [12]). The sensor fusion is usually based on parametric [11] or discrete [10] filtering, pose graph optimization [7, 8], set-membership positioning [12], or hybrid topological/filtering approaches [5].
Localization in an agricultural setting is, in some respects, even more challenging: most of the approaches presented above cannot be directly applied to such a scenario. The environment is poorly structured or unstructured, with very strong sensor aliasing. Indeed, most of the available farming robots currently rely on high-end RTK-GPSs to localize themselves on the field with high accuracy [13, 14, 1, 2]. Some interesting attempts to remove the need for an almost perfect GPS use vision-based crop row detection and tracking during navigation [15].
1.2 Contributions
In this paper, we combine several contributions to provide a robust, unified multi-cue positioning framework for agricultural applications that aims to achieve high accuracy with low-cost GPSs. We locally rely on visual odometry as in [5], depth information as in [10], and inertial information as in [4]. We exploit map priors as in [12] and motion priors as in [15]. We infer the full 6D position of the robot by solving an online graph-based optimization problem as in [7], proposing a Markov Random Field assumption to constrain the pose graph. We evaluate our system with exhaustive experiments that highlight the contribution of each employed cue. We also provide an open-source implementation of our code and two challenging datasets acquired with the Bosch BoniRob farm robot, with the relative ground truth.
2 Multi-Cue Pose Graph
The challenges that must be addressed to design a robust and accurate global pose estimation framework for farming applications are twofold: (i) the agricultural environment appearance, which is usually homogeneous and visually repetitive; (ii) the high number of cues that have to be fused together. In this section we describe how we formulate a robot pose estimation procedure that addresses both these issues.
The proposed system handles the global pose estimation problem as a pose graph optimization problem. A pose graph is a special case of factor graph, where each factor involves at most two pose variables.
We define $\mathbf{x} = (\mathbf{x}_1, \dots, \mathbf{x}_n)$ as the vector of graph nodes that represent the robot poses at discrete points in time, where each node $\mathbf{x}_i$ is represented by the full 3D pose in terms of a translation vector $\mathbf{p}_i$ and, using the axis-angle representation, an orientation vector $\boldsymbol{\alpha}_i$, both in $\mathbb{R}^3$. This pose is defined with respect to a global reference frame.
From the poses of two nodes $\mathbf{x}_i$ and $\mathbf{x}_j$, it is possible to compute the expected relative motion measurement $\hat{\mathbf{z}}_{ij}$ and the expected global measurement $\hat{\mathbf{z}}_{i}$ (see Fig. 2). We formulate the errors between those quantities and the actual measurements $\mathbf{z}_{ij}$ and $\mathbf{z}_{i}$ as:

$$\mathbf{e}_{ij} = \hat{\mathbf{z}}_{ij} - \mathbf{z}_{ij}, \qquad \mathbf{e}_{i} = \hat{\mathbf{z}}_{i} - \mathbf{z}_{i} \quad (1)$$
Thus, for a general sensor providing relative information, we can characterize an edge (i.e., a binary factor) by the error $\mathbf{e}_{ij}$ and the information matrix $\Omega_{ij}$ of the measurement, as described in [16]. In other words, an edge represents the relative pose constraint between two nodes (Fig. 2).
In order to also take into account global pose information, we use unary constraints, namely measurements that constrain a single node. Hence, for a general sensor providing absolute information, we define $\mathbf{e}_{i}$ as the prior edge (i.e., a unary factor) induced by the sensor on node $\mathbf{x}_i$. Fig. 3 depicts a portion of a pose graph highlighting both unary and binary edges. Each edge acts as a directed spring with elasticity inversely proportional to the information matrix associated with the measurement that generates the link.
Our pose graph is built by adding an edge for each sensor reading, for both relative (e.g., wheel odometry readings) and global (e.g., GPS readings) information. In addition, we propose to integrate other prior information that exploits both the specific target environment and the assumptions we made. In the following, we report the full list of edges exploited in this work, divided into local (relative) and global measurements (we report in brackets the acronyms used in Fig. 3 and described in the remainder of this section):

Local measurements:
- Wheel odometry measurements (WO)
- Visual odometry estimations (VO)
- Elevation constraints between adjacent nodes (MRF)
- Ackermann motion model (AMM)
- Point cloud local registration (LID)

Global measurements:
- GPS readings (GPS)
- Digital Elevation Model data (DEM)
- IMU readings (IMU)
We define $\mathbf{e}^{vo}_{ij}$ as the relative constraint induced by a visual odometry algorithm, $\mathbf{e}^{wo}_{ij}$ as the relative constraint induced by the wheel odometry, and $\mathbf{e}^{lid}_{ij}$ as the relative constraint obtained by aligning the local point clouds perceived by the 3D LIDAR sensor.
Oftentimes, GPS and visual odometry provide poor estimates of the robot position along the $z$ axis (i.e., the axis that represents its elevation). In the GPS case, this low accuracy is mainly due to the Dilution of Precision, multipath, or atmospheric disturbances, while in the visual odometry case it is due to the 3D locations of the tracked points. In particular, in a typical agricultural scenario most of the visual features belong to the ground plane. Hence, the small displacement of the features along the $z$ axis may cause a considerable drift.
On the other hand, agricultural fields usually present locally flat ground levels and, moreover, a farming robot usually traverses the field by following the crop rows. Driven by these observations, we enforce the local ground smoothness assumption by introducing an additional type of local constraint that penalizes the distance along the $z$ coordinate between adjacent nodes. Therefore, the built pose graph can be augmented with a 4-connected Markov Random Field (MRF) [17]: each node is conditionally connected with the previous and the next node in the current crop row, and with the nearest nodes that belong to the previous and next crop rows, respectively. We refer to this constraint as $\mathbf{e}^{mrf}_{ij}$ (e.g., the set of MRF edges in Fig. 3).
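To make the neighborhood structure concrete, the following Python sketch (illustrative only, not part of our released C++ implementation; the function name and the crop-row annotation are hypothetical) builds the 4-connected MRF edge set from node positions annotated with a crop-row index: each node is linked to its temporal neighbor in the same row and to the nearest node of the adjacent rows.

```python
import numpy as np

def mrf_neighbors(positions, row_ids):
    """Build the 4-connected MRF edge set: each node is linked to the
    previous/next node in its crop row and to the nearest node (in the
    x-y plane) of the previous and next crop rows."""
    edges = set()
    n = len(positions)
    for i in range(n):
        # temporal neighbor within the same crop row
        if i + 1 < n and row_ids[i + 1] == row_ids[i]:
            edges.add((i, i + 1))
        # nearest node in each adjacent crop row
        for r in (row_ids[i] - 1, row_ids[i] + 1):
            cand = [j for j in range(n) if row_ids[j] == r]
            if cand:
                d = [np.hypot(positions[j][0] - positions[i][0],
                              positions[j][1] - positions[i][1]) for j in cand]
                j = cand[int(np.argmin(d))]
                edges.add(tuple(sorted((i, j))))
    return edges
```

For a robot driving up one row and back down the next, this links each node both along the trajectory and "sideways" to the neighboring row, which is exactly the lattice used by the smoothness term.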
We then add a further type of local constraint based on a standard motion model, namely the Ackermann steering model. This model assumes that the robot is moving on an ideal plane, which is not our case. In this work, we exploit this model by assuming local planar motion between temporally adjacent nodes. Such a motion plane is updated with the attitude estimation for the subsequent node. We integrate this constraint by means of a new type of edge, namely $\mathbf{e}^{amm}_{ij}$.
The aforementioned local constraints are intrinsically affected by a small cumulative drift: to overcome this problem, we integrate drift-free global measurements in the graph as position prior information. In particular, we define a GPS prior $\mathbf{e}^{gps}_i$ and an IMU prior $\mathbf{e}^{imu}_i$ with the associated information matrices, respectively $\Omega^{gps}_i$ and $\Omega^{imu}_i$.
Finally, we introduce an additional global measurement by means of an altitude prior, provided by a Digital Elevation Model. A DEM is a special type of DTM (Digital Terrain Model) that represents the elevation of the terrain at some location by means of a regularly spaced grid of elevation points [18]. The DEM maps a 2D coordinate to an absolute elevation. Since we assume that the altitude varies slowly, we can use the current position estimate (i.e., the $x$ and $y$ components) to query the DEM for a reliable altitude estimation $z^{dem}_i$, with associated information matrix $\Omega^{dem}_i$.
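A DEM query on a regularly spaced grid is typically resolved by bilinear interpolation of the four surrounding grid cells. The sketch below (a minimal illustration, assuming a hypothetical row-major grid with a known origin and cell size, and interior query points only) shows one way such a lookup could work:

```python
import numpy as np

def dem_altitude(dem, origin, cell, x, y):
    """Bilinearly interpolate a regularly spaced elevation grid.
    dem[r, c] is the elevation at (origin[0] + c*cell, origin[1] + r*cell).
    Queries are assumed to fall strictly inside the grid."""
    fx = (x - origin[0]) / cell
    fy = (y - origin[1]) / cell
    c0, r0 = int(np.floor(fx)), int(np.floor(fy))
    tx, ty = fx - c0, fy - r0
    z00 = dem[r0, c0];     z01 = dem[r0, c0 + 1]
    z10 = dem[r0 + 1, c0]; z11 = dem[r0 + 1, c0 + 1]
    top = (1 - tx) * z00 + tx * z01   # interpolate along x at row r0
    bot = (1 - tx) * z10 + tx * z11   # interpolate along x at row r0+1
    return (1 - ty) * top + ty * bot  # blend the two rows along y
```

The returned value plays the role of $z^{dem}_i$ in the altitude prior.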
The cost function is then assembled as follows:
$$F(\mathbf{x}) = \sum_{(i,j) \in \mathcal{C}_b} \mathbf{e}_{ij}^{\top}\, \Omega_{ij}\, \mathbf{e}_{ij} + \sum_{i \in \mathcal{C}_u} \mathbf{e}_{i}^{\top}\, \Omega_{i}\, \mathbf{e}_{i} \quad (2)$$

where $\mathcal{C}_b$ and $\mathcal{C}_u$ represent respectively the sets of binary and unary constraints defined above (see Fig. 3); the MRF smoothness edges in $\mathcal{C}_b$ connect each node $\mathbf{x}_i$ to its 4-connected neighborhood $\mathcal{N}(i)$.
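The assembled cost of Eq. 2 is simply a sum of squared, information-weighted errors. A minimal Python sketch (illustrative; edge containers and names are hypothetical, not the g2o API) makes the bookkeeping explicit:

```python
import numpy as np

def total_cost(binary_edges, unary_edges):
    """F(x) = sum over all constraints of e^T * Omega * e.
    Each edge is a pair (error_vector, information_matrix)."""
    F = 0.0
    for e, omega in list(binary_edges) + list(unary_edges):
        e = np.asarray(e, dtype=float)
        F += float(e @ np.asarray(omega, dtype=float) @ e)
    return F
```

An optimizer such as Levenberg-Marquardt then searches for the node configuration that minimizes this scalar.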
3 Pose Graph Optimization
In this section, we focus on the solution of the cost function reported in Eq. 2, describing the error computation, the employed weighting-factor assignment procedure, and the online and offline versions of the optimization. We finally report some key implementation insights that let the framework reach the desired level of accuracy.
3.1 Error Computation
For each measurement $\mathbf{z}$, given the current graph configuration, we need to provide a prediction $\hat{\mathbf{z}}$ in order to compute the errors in Eq. 2, i.e., the expected measurement given the configuration of the nodes involved in the constraint. Usually, for a binary constraint, this prediction is the relative transformation between the nodes $\mathbf{x}_i$ and $\mathbf{x}_j$, while for a unary constraint it is just the full state or a subset of its components. Defining $X_i$ as a general homogeneous transformation matrix related to the full state of node $\mathbf{x}_i$ (e.g., the homogeneous rigid body transformation generated from $\mathbf{p}_i$ and $\boldsymbol{\alpha}_i$) and $\mathrm{t2v}(\cdot)$ as a generic mapping function from a homogeneous transformation to a vector, we can express $\hat{\mathbf{z}}_{ij}$ and $\hat{\mathbf{z}}_{i}$ as:

$$\hat{\mathbf{z}}_{ij} = \mathrm{t2v}(X_i^{-1} X_j) \quad (3a)$$
$$\hat{\mathbf{z}}_{i} = \mathrm{t2v}(X_i) \quad (3b)$$
In this work not all the constraints involve the full 6D state: indeed, most of the used sensors (e.g., WO, IMU, etc.) can only observe a portion of the full state encoded in $\mathbf{x}_i$. Therefore, in the following, we show how we obtain the expected measurement $\hat{\mathbf{z}}$ for each involved cue (for some components, we omit the subscripts $i$ and $j$ by using the relative translations and rotations between adjacent nodes):
VO and LID: these front-ends provide the full 6D robot motion. Hence we build $\hat{\mathbf{z}}^{vo}_{ij}$ and $\hat{\mathbf{z}}^{lid}_{ij}$ by computing the relative transformation between the two connected nodes as in Eq. 3a;
WO: the robot odometry provides only the planar motion by means of a roto-translation $(x, y, \theta_z)$. Therefore we build $\hat{\mathbf{z}}^{wo}_{ij} = \mathrm{t2v}(X_i^{-1} X_j)_{[x, y, \theta_z]}$, where the subscript used after the brackets specifies that the map to the vector involves only such components;
MRF and DEM: both measurements constrain only the altitude of the robot; we obtain the estimated measurements as:

$$\hat{\mathbf{z}}^{mrf}_{ij} = \mathrm{t2v}(X_i^{-1} X_j)_{[z]} \quad (4a)$$
$$\hat{\mathbf{z}}^{dem}_{i} = \mathrm{t2v}(X_i)_{[z]} \quad (4b)$$
GPS: this sensor only provides the robot position:

$$\hat{\mathbf{z}}^{gps}_{i} = \mathrm{t2v}(X_i)_{[x, y, z]} \quad (5)$$
IMU: from this measurement we actually exploit only the roll and pitch angles, since the rotation around the $z$ axis provided by the IMU is affected by a cumulative drift. Therefore, we obtain $\hat{\mathbf{z}}^{imu}_{i} = \mathrm{t2v}(X_i)_{[\theta_x, \theta_y]}$;
AMM: we formulate such a constraint by a composition of two transformation matrices. The first one encodes a roto-translation of the robot around the so-called Instantaneous Center of Rotation (ICR). We follow the same formulation presented in [19]:

$$T_{icr} = \begin{bmatrix} \cos\theta_z & -\sin\theta_z & 0 & \rho\cos\theta_z \\ \sin\theta_z & \cos\theta_z & 0 & \rho\sin\theta_z \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (6)$$

where $\rho$ is the norm of the translation along $x$ and $y$. Additionally, we add a further rotation around those two axes, taking also into account the ground slope by rotating the ideal plane on which the vehicle steers by following the Ackermann motion model:

$$T_{slope} = \begin{bmatrix} R_x(\theta_x)\, R_y(\theta_y) & \mathbf{0} \\ \mathbf{0}^{\top} & 1 \end{bmatrix} \quad (7)$$

Hence, we obtain $\hat{\mathbf{z}}^{amm}_{ij} = \mathrm{t2v}(T_{icr}\, T_{slope})$.
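The $\mathrm{t2v}$ / $\mathrm{v2t}$ machinery used by all the predictions above can be sketched in a few lines of Python. This is an illustrative re-implementation (not our released C++ code): `v2t` builds a homogeneous matrix from a translation plus axis-angle vector via Rodrigues' formula, `t2v` inverts it via the rotation log map, and `expected_relative` computes $\hat{\mathbf{z}}_{ij}$ of Eq. 3a, optionally restricted to a subset of components as done for WO, MRF, or GPS.

```python
import numpy as np

def v2t(v):
    """6-vector (tx, ty, tz, ax, ay, az) -> 4x4 homogeneous matrix,
    with (ax, ay, az) an axis-angle rotation (Rodrigues' formula)."""
    t, a = np.asarray(v[:3], float), np.asarray(v[3:], float)
    th = np.linalg.norm(a)
    K = np.array([[0, -a[2], a[1]], [a[2], 0, -a[0]], [-a[1], a[0], 0]])
    R = np.eye(3) if th < 1e-12 else (
        np.eye(3) + np.sin(th) / th * K + (1 - np.cos(th)) / th**2 * K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def t2v(T):
    """Inverse map: 4x4 homogeneous matrix -> (tx, ty, tz, ax, ay, az)."""
    R = T[:3, :3]
    th = np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))
    if th < 1e-12:
        a = np.zeros(3)
    else:
        a = th / (2 * np.sin(th)) * np.array(
            [R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return np.concatenate([T[:3, 3], a])

def expected_relative(Ti, Tj, idx=None):
    """hat z_ij = t2v(Ti^-1 Tj), optionally restricted to components idx."""
    z = t2v(np.linalg.inv(Ti) @ Tj)
    return z if idx is None else z[list(idx)]
```

For instance, `expected_relative(Ti, Tj, idx=[2])` yields the altitude-only prediction of Eq. 4a.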
3.2 Dynamic Weights Assignment
The impact of each constraint in the final cost function (Eq. 2) is weighted by its relative information matrix. As a consequence, such information matrices play a crucial role in weighting the measurements, i.e., giving too much weight to a noisy sensor can lead to errors in the optimization phase. We tackle this problem by dynamically assigning the information matrix of each component as follows:
WO: we use as information matrix the inverse of the covariance matrix of the robot odometry, scaled by the magnitude of the distance and rotation traveled between the nodes $\mathbf{x}_i$ and $\mathbf{x}_j$, as explained in [20];
VO: we use the inverse of the covariance matrix provided as output by the visual odometry front-end, weighting the rotational and translational sub-matrices with two experimentally tuned scalars;
MRF: we set the information matrix $\Omega^{mrf}_{ij} = w_{ij}\, \omega^{mrf}$. The weight $w_{ij}$ is proportional to the distance in the $(x, y)$ plane between the two nodes, while $\omega^{mrf}$ has been experimentally tuned;
GPS: we use as information matrix the inverse of the covariance matrix provided as output by the GPS sensor;
AMM: we use as information matrix an identity matrix scaled by the magnitude of the traveled distance between the nodes $\mathbf{x}_i$ and $\mathbf{x}_j$, similarly to the wheel odometry constraint. This allows us to model the reliability of such a constraint as inversely proportional to the traveled distance;
IMU: we use as information matrix the inverse of the covariance matrix provided by the IMU frontend;
DEM: we set the information matrix $\Omega^{dem}_{i} = \omega^{dem}$, where $\omega^{dem}$ is empirically tuned;
LID: from the full state provided by the point cloud registration front-end, in this work we exploit only the components $\theta_x$, $\theta_y$ and $z$ (i.e., roll, pitch and elevation). This is mainly due to the lack of geometric structure in a farming scenario, which cannot provide reliable constraints along the other components. Hence, we set the information matrix $\Omega^{lid}_{ij}$ as a diagonal matrix over those components, where all the parameters are empirically tuned.
3.3 SlidingWindow Optimization
A re-optimization of the whole pose graph presented above, every time a new node is added, cannot guarantee the real-time performance required for online field operations, especially when the graph contains a large number of nodes and constraints. We solve this issue by employing a sliding-window approach, namely performing the optimization procedure only on a sub-graph that includes a sequence of recent nodes. More specifically, each time a new node associated with the most recent sensor readings is added to the graph, the sub-graph is updated by adding the new node and removing the oldest one, in a sliding-window fashion. The optimization process is performed only on the current sub-graph, while older nodes keep the state assigned during the last optimization in which they were involved. In order to preserve the MRF constraints, the size of the sub-graph is automatically computed so that all adjacent nodes in the previous row are included (see Fig. 4). A global optimization of the whole pose graph is then carried out offline, using as initial guess the node states assigned online by the sliding-window approach.
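The window-growing rule can be sketched as follows (a hypothetical helper, not taken from our implementation): start from the most recent nodes and extend the window backwards until every MRF edge touching it has both endpoints inside.

```python
def window_ids(n_nodes, size, mrf_edges):
    """Return the ids of the sub-graph to optimize: the `size` most
    recent nodes, extended backwards so that every MRF edge touching
    the window has both of its endpoints inside it."""
    start = max(0, n_nodes - size)
    changed = True
    while changed:
        changed = False
        for i, j in mrf_edges:
            lo, hi = min(i, j), max(i, j)
            # edge crosses the window boundary: grow the window
            if hi >= start and lo < start:
                start = lo
                changed = True
    return list(range(start, n_nodes))
```

In practice this means a window at the end of a crop row automatically stretches back to cover the matching nodes of the previous row.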
3.4 Implementation Details
Temporal Synchronization: in the previous sections we tacitly assumed that the measurements associated with a node (e.g., WO and GPS readings) share the same time stamp. Usually, in realistic contexts, this does not happen, since measurements arrive at different times. We solve this issue by interpolating the wheel odometry motion estimations, it being the sensor with the highest frame rate, to the time stamps of the other sensors.
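A minimal version of this interpolation, for the planar wheel-odometry track, might look as follows (an illustrative sketch; a full implementation would interpolate the complete pose, while here only $(x, y, \theta)$ is handled, with angle wrapping):

```python
import numpy as np

def interpolate_odometry(stamps, poses_xytheta, t):
    """Linearly interpolate the planar wheel-odometry track (x, y, theta)
    at an arbitrary sensor time stamp t; clamps outside the track."""
    stamps = np.asarray(stamps, dtype=float)
    k = np.searchsorted(stamps, t)
    if k == 0:
        return np.asarray(poses_xytheta[0], dtype=float)
    if k == len(stamps):
        return np.asarray(poses_xytheta[-1], dtype=float)
    t0, t1 = stamps[k - 1], stamps[k]
    a = (t - t0) / (t1 - t0)
    p0 = np.asarray(poses_xytheta[k - 1], dtype=float)
    p1 = np.asarray(poses_xytheta[k], dtype=float)
    d = p1 - p0
    d[2] = (d[2] + np.pi) % (2 * np.pi) - np.pi  # shortest angular path
    return p0 + a * d
```

The interpolated pose is then paired with the slower sensor reading (e.g., a GPS fix) to build a consistent graph node.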
Visual Odometry Failures: another central challenge is introduced by VO failures. We mitigate this problem by introducing a consistency check that exploits the local reliability of the wheel odometry: when the difference between the WO and VO estimates is greater than a given threshold, we assume a failure of the latter. In this case, we decrease the influence of the VO in the optimization by scaling down its information matrix.
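The consistency check reduces to comparing the two relative motions and down-weighting the VO edge on disagreement. A sketch (the threshold and penalty values are hypothetical placeholders, not our tuned parameters):

```python
import numpy as np

def check_vo(wo_delta, vo_delta, info_vo, thresh=0.5, penalty=100.0):
    """Compare the WO and VO relative translations; if they disagree
    beyond `thresh`, scale the VO information matrix down by `penalty`
    so that the optimizer trusts the (assumed failed) VO edge less.
    Returns (possibly scaled information matrix, consistency flag)."""
    diff = np.linalg.norm(np.asarray(wo_delta, float) - np.asarray(vo_delta, float))
    if diff > thresh:
        return np.asarray(info_vo, dtype=float) / penalty, False
    return np.asarray(info_vo, dtype=float), True
```

Keeping the edge with a weak weight, instead of dropping it, keeps the graph connected while letting the other cues dominate.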
Point cloud registration: point clouds acquired by a 3D LIDAR are typically too sparse to perform a robust alignment; thus, we accumulate several LIDAR readings into a single point cloud by using the motion estimations provided by the visual odometry. The point cloud registration is finally performed using the Iterative Closest Point (ICP) algorithm.
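The core of each ICP iteration is a closed-form least-squares rigid alignment of corresponded points (the Kabsch/SVD solution). The sketch below shows only this alignment step, under the assumption that correspondences are already given; a full ICP would alternate it with nearest-neighbor matching:

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rigid transform (R, t) mapping points P onto Q,
    i.e., minimizing sum ||R p + t - q||^2 over corresponded rows."""
    P, Q = np.asarray(P, dtype=float), np.asarray(Q, dtype=float)
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)              # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # correct a possible reflection so that R is a proper rotation
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t
```

The resulting relative transform between two accumulated clouds is what feeds the LID edge of the graph.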
Graph Optimization: we perform both the online and offline pose graph optimization (Sec. 3.3) using the Levenberg-Marquardt algorithm implemented in the g2o graph optimization framework [3].
4 Experiments
In order to analyze the performance of our system, we collected two datasets, in the following called Dataset A and Dataset B.
Both datasets include data from the same set of sensors: (i) wheel odometry; (ii) a VI-Sensor device (stereo camera + IMU) [21]; (iii) a Velodyne VLP-16 3D LIDAR; (iv) a low-cost u-blox RTK-GPS; (v) a u-blox Precise Point Positioning (PPP) GPS. We acquired the ground truth reference 3D location ($x$, $y$ and $z$) by using a LEICA laser tracker. Both datasets have been acquired by using the Bosch BoniRob farm robot (Fig. 1, left) on a field in Eschikon, Switzerland (Fig. 1, right). In addition to these two datasets, we have created a third dataset, namely Dataset C, where we simulate a sudden RTK-GPS signal loss, e.g., due to a communication loss between the base and the rover stations. In particular, we simulated the accuracy losses by randomly switching for some time to the PPP-GPS readings.
In the following, we report the quantitative results by using the following statistics built upon the localization errors with respect to the ground truth reference: the Root Mean Square Error ($RMSE$ in the tables), the maximum and mean absolute errors ($e_{max}$ and $e_{mean}$), and the mean absolute error along each component ($e_x$, $e_y$ and $e_z$).
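These statistics can be reproduced with a few lines of Python (a reference sketch with a hypothetical helper name; the per-axis statistics here are mean absolute errors, the 3D statistics are computed on the error norm):

```python
import numpy as np

def error_stats(est, gt):
    """Localization error statistics of an estimated trajectory w.r.t.
    the ground truth: RMSE, max and mean of the 3D error norm, plus the
    per-axis mean absolute errors."""
    d = np.asarray(est, dtype=float) - np.asarray(gt, dtype=float)
    norms = np.linalg.norm(d, axis=1)
    return {
        "rmse": float(np.sqrt(np.mean(norms ** 2))),
        "max": float(norms.max()),
        "mean": float(norms.mean()),
        "mean_xyz": np.abs(d).mean(axis=0).tolist(),
    }
```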
4.1 Visual Odometry Evaluation
                  Dataset A           Dataset B
                  RMSE   $e_{mean}$   RMSE   $e_{mean}$
ORB-SLAM2 [22]    0.116  0.091        0.171  0.139
S-PTAM [23]       0.085  0.061        0.168  0.152
DSO [24] (*)      0.071  0.068        0.124  0.109
SVO 2 [25] (*)    0.129  0.104        0.229  0.205
We selected the VO subsystem on the basis of an exhaustive performance evaluation, over our datasets, of four open-source implementations of modern visual odometry front-ends ([22, 23, 24, 25]). In Tab. 1 we report the results of the evaluation, where we compensated the cumulative drift of monocular systems (marked with (*)) by estimating the scale from the wheel odometry front-end. Unlike the subsequent experiments, the $RMSE$ and $e_{mean}$ errors here refer to the relative translational error, computed for each meter of trajectory. The best performing system was DSO [24], which we then used in all our experiments.
4.2 Dataset A and Dataset B
Dataset A | Dataset B
Cue columns (✓ = cue enabled): GPS, WO, VO, IMU, AMM, ELEV, LIDAR, MRF, SW (sliding-window optimization). For each dataset, the reported error statistics are $e_x$, $e_y$, $e_z$, $e_{max}$ and $e_{mean}$.
PPP  0.349  0.582  1.577  2.959  1.710  0.306  0.501  1.484  2.875  1.621  
✓  0.311  0.520  1.537  2.954  1.630  0.246  0.416  1.424  2.829  1.504  
✓  ✓  0.242  0.421  0.682  1.558  0.978  0.228  0.368  1.321  2.453  1.289  
✓  ✓  ✓  0.236  0.421  0.651  1.401  0.865  0.225  0.362  1.289  2.569  1.240  
✓  ✓  ✓  0.244  0.413  0.523  1.399  0.728  0.224  0.365  1.022  2.380  1.120  
✓  ✓  ✓  ✓  0.223  0.414  0.558  1.369  0.719  0.196  0.342  0.870  2.033  0.943  
✓  ✓  ✓  ✓  0.228  0.381  0.528  1.276  0.724  0.232  0.395  0.678  1.649  0.818  
✓  ✓  ✓  ✓  0.235  0.353  0.517  1.284  0.749  0.228  0.365  0.642  1.432  0.759  
✓  ✓  ✓  ✓  ✓  0.224  0.369  0.454  1.138  0.616  0.218  0.358  0.621  1.602  0.732  
✓  ✓  ✓  ✓  ✓  0.235  0.359  0.441  1.113  0.566  0.198  0.361  0.478  1.153  0.657  
✓  ✓  ✓  ✓  ✓  ✓  0.235  0.348  0.308  0.935  0.424  0.196  0.353  0.467  1.198  0.617  
✓  ✓  ✓  ✓  ✓  ✓  0.205  0.328  0.298  0.913  0.417  0.181  0.342  0.363  1.158  0.476  
✓  ✓  ✓  ✓  ✓  ✓  ✓  0.199  0.330  0.296  0.865  0.405  0.172  0.337  0.313  1.149  0.455  
✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  0.257  0.413  0.345  0.987  0.551  0.292  0.427  0.457  1.301  0.622  
RTK  0.059  0.051  0.121  0.431  0.128  0.054  0.062  0.091  0.322  0.122  
✓  0.053  0.042  0.105  0.431  0.125  0.049  0.058  0.086  0.321  0.119  
✓  ✓  0.053  0.042  0.055  0.281  0.087  0.047  0.048  0.063  0.195  0.092  
✓  ✓  ✓  0.048  0.048  0.061  0.298  0.091  0.045  0.046  0.064  0.203  0.091  
✓  ✓  ✓  0.046  0.047  0.061  0.271  0.090  0.045  0.045  0.064  0.200  0.090  
✓  ✓  ✓  ✓  0.046  0.047  0.061  0.284  0.089  0.045  0.045  0.061  0.190  0.089  
✓  ✓  ✓  ✓  0.046  0.049  0.056  0.251  0.088  0.045  0.046  0.039  0.162  0.075  
✓  ✓  ✓  ✓  ✓  0.046  0.049  0.034  0.253  0.075  0.045  0.046  0.034  0.154  0.073  
✓  ✓  ✓  ✓  ✓  ✓  ✓  0.051  0.050  0.069  0.313  0.098  0.048  0.047  0.065  0.223  0.097  
✓  ✓  ✓  ✓  ✓  ✓  0.047  0.048  0.034  0.258  0.075  0.045  0.046  0.033  0.154  0.072  
✓  ✓  ✓  ✓  ✓  ✓  ✓  0.052  0.055  0.041  0.272  0.084  0.050  0.051  0.037  0.169  0.083 
This set of experiments is designed to demonstrate the effectiveness of the proposed method, and the benefits introduced by each cue within the final estimation. We report in Tab. 2 the results obtained by using different sensor combinations and optimization procedures over Dataset A and Dataset B. The table is split according to the type of GPS sensor used; the best results are highlighted in bold. A first remarkable result is the almost monotonically decreasing error trend: the more sensors we introduce into the optimization process, the smaller the resulting $e_{mean}$ and $e_{max}$ errors are. This behavior occurs in both Dataset A and Dataset B, and proves how the proposed method properly handles all the available sources of information. Another important outcome is the relative improvement obtained between the worst and the best set of cues, which is around 37% for the RTK case and 76% for the PPP case. A noteworthy decrease also occurs in the $e_{max}$ error statistic (40% and 70%, respectively): this fact brings a considerable benefit to agricultural applications, where spikes in the location error might lead to crushing crops. For the best performing sensor setup, we also report the results obtained by using the sliding-window, online pose graph optimization procedure (Sec. 3.3): also in this case the relative improvement is remarkable (32% and 67%, respectively), enabling a safer and more accurate real-time UGV navigation.
Fig. 5 (top) depicts a qualitative top-view comparison between the raw PPP-GPS trajectory (top-left) and the trajectory (top-right) obtained after the pose graph optimization using the best sensor configuration (see Tab. 2) on Dataset A. The error plots (bottom) show how the introduction of additional sensors and constraints significantly improves the pose estimation. Similar qualitative results for Dataset B are reported in Fig. 1. Fig. 6 reports the error plot of the RTK-GPS trajectory versus the trajectory estimated by our system with the best sensor ensemble.
For both GPSs, the maximum error reduction happens when all the available cues are used within the optimization procedure, except for the low-cost RTK-GPS case, where the ELEV constraint worsens the error along the $z$ axis. This phenomenon happens since an RTK-GPS provides, for most of the time, an accurate estimate along all the translational axes: in this case, the DEM does not provide any useful information to the optimization procedure. Another important aspect to highlight is how the addition of constraints that only act on a part of the state also positively affects the remaining state components.
This is especially evident with the PPP-GPS, which brings a large amount of uncertainty into the optimization. In this case, adding the attitude information (e.g., the IMU measurements) also yields a small error reduction along the translational state components. Similarly, as shown in Tab. 2, the addition of the LIDAR and ELEV constraints leads to an error reduction also along the $x$ and $y$ axes. As a further qualitative evaluation, in Fig. 7 we report the point cloud obtained by rendering the LIDAR scans at each estimated position with and without the IMU contribution within the optimization procedure: the attitude estimation greatly benefits from this contribution.
The runtimes of our system are reported in Tab. 3, for both the offline and the online, sliding-window cases.
SW  

Dataset A  786  8259  24  13.493  
✓  98  763  4  0.0989  
Dataset B  754  8032  22  12.993  
✓  104  851  5  0.1131 
4.3 Dataset C
Dataset C
Cue columns (✓ = cue enabled): GPS, WO, VO, IMU, AMM, ELEV, LIDAR, MRF. The reported error statistics are $e_{max}$ and $e_{mean}$.
RTK+PPP  1.313  0.647  
✓  1.291  0.613  
✓  ✓  1.268  0.567  
✓  ✓  ✓  ✓  1.180  0.435  
✓  ✓  ✓  ✓  0.892  0.359  
✓  ✓  ✓  ✓  ✓  0.556  0.227  
✓  ✓  ✓  ✓  0.649  0.201  
✓  ✓  ✓  ✓  ✓  0.505  0.200  
✓  ✓  ✓  ✓  ✓  ✓  ✓  0.537  0.182  
✓  ✓  ✓  ✓  ✓  ✓  0.411  0.166 
The second set of experiments is designed to prove the robustness of the proposed system against sudden losses of GPS accuracy.
In Tab. 4 we report the quantitative results of our system over Dataset C by means of the $e_{max}$ and $e_{mean}$ errors. Even in the presence of an RTK-GPS signal loss that lasts for more than one crop row (see Fig. 8, left), the best sensor setup leads to a remarkable $e_{mean}$ of 0.166 m and a relative improvement of around 72%. Moreover, also in Dataset C the $e_{max}$ and $e_{mean}$ error statistics follow the same decreasing trend shown in Tab. 2.
Fig. 8 (left) depicts a top view of the optimized trajectory obtained using the best sensor configuration (see Tab. 4), highlighting the part of the path where the accuracy loss occurs. In Fig. 8 (right) we compare the absolute error trajectory for the best sensor configuration against the error trajectory obtained by using only the GPS measurements: the part where the signal loss occurs corresponds to a higher error.
Another interesting observation regards the inconsistent effects related to the use of the ELEV constraint. As shown in Tab. 4, in some cases it decreases the overall error, while in others it considerably worsens the estimate. The latter mostly happens when the pose estimation is already fairly reliable (e.g., when most of the available constraints are in use). Indeed, as explained in Sec. 4.2, in such cases the ELEV constraint does not provide any additional information to the optimization procedure, while with a less accurate GPS (e.g., the PPP-GPS used in our experiments) its employment is certainly desirable.
5 Conclusion
In this paper, we presented an effective global pose estimation system for agricultural applications that leverages an ensemble of cues in a reliable and efficient way. We take advantage of the specificity of the scenario by introducing new constraints, exploited inside a pose-graph realization that aims to enhance the strengths of each integrated information source. We report a comprehensive set of experiments that support our claims: the provided localization accuracy is remarkable, the accuracy improvement scales well with the number of integrated cues, and the proposed system is able to work effectively with different types of GPS, even in the presence of signal degradations. The open-source implementation of our system, along with the acquired datasets, is made publicly available with this paper.
Acknowledgment
We are grateful to Wolfram Burgard for providing us with the Bosch BoniRob, and to Raghav Khanna and Frank Liebisch for helping us acquire the datasets.
Footnotes
 High-end RTK-GPSs typically provide 1-2 cm horizontal accuracy and 3-5 cm vertical accuracy.
 In this paper, we use GPS as a synonym of the more general acronym GNSS (Global Navigation Satellite System), since almost all GNSSs use at least the GPS system, including the two GNSSs used in our experiments.
 In open-loop visual odometry systems, the cumulative drift is unavoidable.
 We use here the term adjacent to denote nodes that are temporally or spatially close.
 A factor graph is a bipartite graph where nodes encode either variables or measurements, namely the factors.
 We transform each global measurement (e.g., GPS measurements) into the global reference frame.
 Publicly available at:
http://www.dis.uniroma1.it/~labrococo/fds