
B. Contribution
Our work is motivated by the broad literature on lidar- and
IMU-based state estimation. Our proposed contributions are:
• Multi-lidar odometry using a single fused local submap and handling lidar dropout scenarios.
• MIMU fusion which compensates for the Coriolis effect and accounts for potential signal loss.
• A factor graph framework to jointly fuse multiple lidars, MIMUs and GNSS signals for robust state estimation in a global frame.
• Experimental results and verification using data collected from Scania vehicles with the sensor setup shown in Fig. 2, with FoV schematics similar to Fig. 1.
II. RELATED WORK
There have been multiple studies of lidar- and IMU-based SLAM after the seminal LOAM paper by Zhang et al. [10], which is itself motivated by the generalized-ICP work of Segal et al. [11]. In our discussion we briefly review the relevant literature.
A. Direct tightly coupled multi-lidar odometry
To develop real-time SLAM systems, simple edge and plane features are often extracted from the point clouds and tracked between frames for computational efficiency [3], [12], [13]. Using IMU propagation, motion priors can then be used to enable matching of point cloud features between key-frames. However, this principle cannot be applied in featureless environments. Hence, instead of feature engineering, the whole point cloud is often processed, analogous to processing the whole image in visual odometry methods such as LSD-SLAM [14]; this is known as direct estimation.
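As an illustration of this feature-engineering step, the sketch below computes a LOAM-style local smoothness score along a single scan line and labels high-curvature points as edge candidates and low-curvature points as planar candidates. The window size and thresholds are placeholder values chosen for readability, not parameters taken from the cited works.

```python
import numpy as np

def classify_scanline_points(points, window=5, edge_thresh=1.0, plane_thresh=0.1):
    """Toy LOAM-style feature classification for a single scan line.

    Computes the local smoothness c = ||sum_j (p_i - p_j)|| / (|S| * ||p_i||)
    over a symmetric window S around each point and labels high-curvature
    points as edge candidates and low-curvature points as planar candidates.
    """
    points = np.asarray(points, dtype=float)           # (N, 3) points of one scan line
    edges, planes = [], []
    for i in range(window, len(points) - window):
        nbrs = np.r_[points[i - window:i], points[i + 1:i + 1 + window]]
        diff = (points[i] - nbrs).sum(axis=0)          # sum of difference vectors
        c = np.linalg.norm(diff) / (len(nbrs) * np.linalg.norm(points[i]) + 1e-9)
        if c > edge_thresh:
            edges.append(i)                            # sharp structure, e.g. a pole or corner
        elif c < plane_thresh:
            planes.append(i)                           # locally flat, e.g. ground or a wall
    return edges, planes
```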
To support direct methods, Xu et al. [15] recently proposed the ikd-tree in their Fast-LIO2 work, which efficiently inserts, updates, searches and filters points to maintain a local submap. The ikd-tree achieves its efficiency through “lazy delete” and “parallel tree re-balancing” mechanisms.
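To make the lazy-delete idea concrete, the following is a minimal sketch of a local submap that only marks points as deleted and rebuilds its search structure once too many dead points accumulate. It uses SciPy's cKDTree and a full rebuild in place of the ikd-tree's incremental, parallel re-balancing, so it illustrates the principle rather than the actual implementation in [15].

```python
import numpy as np
from scipy.spatial import cKDTree

class LocalSubmap:
    """Minimal lazy-delete submap sketch (not the actual ikd-tree of [15])."""

    def __init__(self, rebuild_ratio=0.3):
        self.points = np.empty((0, 3))
        self.alive = np.empty(0, dtype=bool)
        self.tree = None
        self.rebuild_ratio = rebuild_ratio             # dead-point fraction that triggers a rebuild

    def insert(self, pts):
        self.points = np.vstack([self.points, pts])
        self.alive = np.r_[self.alive, np.ones(len(pts), dtype=bool)]
        self.tree = cKDTree(self.points)               # full rebuild; the ikd-tree updates incrementally

    def delete_box(self, lo, hi):
        """Lazy delete: mark points inside the axis-aligned box [lo, hi] as dead."""
        inside = np.all((self.points >= lo) & (self.points <= hi), axis=1)
        self.alive &= ~inside
        if 1.0 - self.alive.mean() > self.rebuild_ratio:
            self.points = self.points[self.alive]      # "re-balance": drop dead points and rebuild
            self.alive = np.ones(len(self.points), dtype=bool)
            self.tree = cKDTree(self.points)

    def nearest(self, query, k=5):
        """k-nearest neighbours, skipping lazily deleted points."""
        dists, idx = self.tree.query(query, k=min(2 * k, len(self.points)))
        keep = self.alive[idx]
        return dists[keep][:k], idx[keep][:k]
```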
Furthermore, instead of point-wise operations, the authors of Faster-LIO [16] proposed voxel-wise operations for point cloud association across frames and reported improved efficiency. In our work we also maintain an ikd-tree of the fused lidar measurements and tightly couple the relative lidar poses, IMU preintegration and the GNSS prior in our proposed estimator. Since we jointly estimate the state based on a residual cost function built upon the multiple modalities, we consider this a tightly coupled system.
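The sketch below illustrates the voxel-wise association idea: points are hashed into a voxel grid, and nearest-neighbour lookups only examine the query point's voxel and its immediate neighbours instead of performing a global tree search. The voxel size and the 3x3x3 neighbourhood are illustrative choices, not the settings used in Faster-LIO [16] or in our system.

```python
import numpy as np
from collections import defaultdict

class VoxelMap:
    """Sketch of voxel-hashed point association, in the spirit of iVox in Faster-LIO [16]."""

    def __init__(self, voxel_size=0.5):
        self.voxel_size = voxel_size
        self.voxels = defaultdict(list)                # (i, j, k) voxel index -> list of points

    def _key(self, p):
        return tuple(np.floor(p / self.voxel_size).astype(int))

    def insert(self, points):
        for p in np.asarray(points, dtype=float):
            self.voxels[self._key(p)].append(p)

    def nearest(self, query):
        """Closest stored point within the 3x3x3 voxel neighbourhood of the query (or None)."""
        query = np.asarray(query, dtype=float)
        i, j, k = self._key(query)
        candidates = []
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                for dk in (-1, 0, 1):
                    candidates.extend(self.voxels.get((i + di, j + dj, k + dk), []))
        if not candidates:
            return None
        candidates = np.asarray(candidates)
        return candidates[np.argmin(np.linalg.norm(candidates - query, axis=1))]
```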
While there have been many studies of state estimation using a single lidar and IMU, there is limited literature available on fusing multi-lidar and MIMU systems. Our idea closely resembles M-LOAM [7], where state estimation with multiple lidars and calibration was performed. However, the working principles are different, as M-LOAM is not a direct method and an MIMU system is not considered in their work.
Finally, most authors do not address how to achieve reliability in situations of signal loss, an issue which is important for practical operational scenarios.
Fig. 2. Reference frame conventions for our vehicle platform. The world frame W is a fixed frame, while the base frame B, as shown in Fig. 1, is located at the rear axle center of the vehicle. Each sensor unit contains the two optical frames C, an IMU frame I, and a lidar frame L. The cameras are shown for illustration only and are not used in this work.
III. PROBLEM STATEMENT
A. Sensor platform and reference frames
The sensor platform with its corresponding reference frames is shown in Fig. 2, along with the illustrative sensor fields-of-view in Fig. 1. Each of the sensor housings contains a lidar with a corresponding embedded IMU and two cameras. Although we do not use the cameras in this work, they are illustrated here to show the full sensor setup. We used logs from a bus and a truck with similar sensor housings for our experiments. The two lower rear-mounted modules, present in both vehicles, are not visible in the picture. The embedded IMUs within the lidar sensors are used to form the MIMU setup.
Now we describe the necessary notation and reference frames used in our system, following the convention of Furgale [17]. The vehicle base frame B is located at the center of the rear axle of the vehicle. Sensor readings from GNSS, lidars, cameras and IMUs are represented in their respective sensor frames G, L(k), C(k) and I(k), respectively. Here, $k \in \{FL, FR, RL, RR\}$ denotes the location of the sensor on the vehicle, corresponding to front-left, front-right, rear-left and rear-right, respectively. The GNSS measurements are reported in the fixed world frame W and transformed to the B frame through a calibration routine that is outside the scope of this work. In our discussions the transformation matrix is denoted as
$$\mathbf{T} = \begin{bmatrix} \mathbf{R}_{3\times3} & \mathbf{t}_{3\times1} \\ \mathbf{0}^{\top} & 1 \end{bmatrix} \in SE(3),$$
where $\mathbf{R}\mathbf{R}^{\top} = \mathbf{I}_{3\times3}$, since the rotation matrix is orthogonal.
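As a concrete illustration of this convention, the following sketch assembles a homogeneous transform from a rotation and a translation, verifies the orthogonality of the rotation block, and maps a lidar point into the base frame B. The numerical extrinsic values are placeholders, not calibrated parameters of our platform.

```python
import numpy as np

def make_T(R, t):
    """Assemble the 4x4 homogeneous transform T = [[R, t], [0, 1]] in SE(3)."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def invert_T(T):
    """Closed-form SE(3) inverse: [[R^T, -R^T t], [0, 1]]."""
    R, t = T[:3, :3], T[:3, 3]
    return make_T(R.T, -R.T @ t)

# Placeholder lidar-to-base extrinsic (30 deg yaw plus a translation), not a calibrated value.
yaw = np.deg2rad(30.0)
R_BL = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
                 [np.sin(yaw),  np.cos(yaw), 0.0],
                 [0.0,          0.0,         1.0]])
T_BL = make_T(R_BL, np.array([3.2, 0.8, 1.5]))

assert np.allclose(R_BL @ R_BL.T, np.eye(3))          # R R^T = I: the rotation block is orthogonal
assert np.allclose(T_BL @ invert_T(T_BL), np.eye(4))  # T T^-1 = I

p_L = np.array([10.0, -2.0, 0.5, 1.0])                # a lidar point in homogeneous coordinates
p_B = T_BL @ p_L                                      # the same point expressed in the base frame B
```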
B. Problem formulation
Our primary goal is to estimate the position ${}_{W}\mathbf{t}_{WB}$, orientation $\mathbf{R}_{WB}$, linear velocity ${}_{W}\mathbf{v}_{WB}$, and angular velocity ${}_{W}\boldsymbol{\omega}_{WB}$ of the base frame B relative to the fixed world frame W. Additionally, we also estimate the MIMU biases $\mathbf{b}^{g}_{B}, \mathbf{b}^{a}_{B}$ expressed in the B frame, as that is where they are sensed. Hence, our estimate of the vehicle's state $\mathbf{x}_i$ at time $t_i$ is denoted as:
$$\mathbf{x}_i = [\mathbf{R}_i, \mathbf{t}_i, \mathbf{v}_i, \boldsymbol{\omega}_i, \mathbf{b}^{a}_{i}, \mathbf{b}^{g}_{i}] \in SE(3) \times \mathbb{R}^{15}, \tag{1}$$
where the corresponding measurements are expressed in the frames mentioned above.
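One possible in-code representation of this state, together with an on-manifold retraction for the rotation block, is sketched below. The field names and the ordering of the 18-dof error vector are illustrative choices and do not correspond to the exact parameterization used in our estimator.

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class VehicleState:
    """Illustrative container for the state in Eq. (1); field names are our own."""
    R:  np.ndarray = field(default_factory=lambda: np.eye(3))    # orientation of B in W
    t:  np.ndarray = field(default_factory=lambda: np.zeros(3))  # position of B in W
    v:  np.ndarray = field(default_factory=lambda: np.zeros(3))  # linear velocity
    w:  np.ndarray = field(default_factory=lambda: np.zeros(3))  # angular velocity
    ba: np.ndarray = field(default_factory=lambda: np.zeros(3))  # accelerometer bias (B frame)
    bg: np.ndarray = field(default_factory=lambda: np.zeros(3))  # gyroscope bias (B frame)

def so3_exp(phi):
    """Rodrigues' formula: rotation vector -> rotation matrix."""
    angle = np.linalg.norm(phi)
    if angle < 1e-12:
        return np.eye(3)
    a = phi / angle
    K = np.array([[0, -a[2], a[1]], [a[2], 0, -a[0]], [-a[1], a[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * K @ K

def retract(x: VehicleState, dx: np.ndarray) -> VehicleState:
    """Apply an 18-dof correction [dphi, dt, dv, dw, dba, dbg] to the state.

    The orientation is updated multiplicatively on the manifold; the remaining
    blocks are updated additively, as is usual for on-manifold estimators.
    """
    d = np.asarray(dx, dtype=float).reshape(6, 3)
    return VehicleState(R=x.R @ so3_exp(d[0]), t=x.t + d[1], v=x.v + d[2],
                        w=x.w + d[3], ba=x.ba + d[4], bg=x.bg + d[5])
```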
IV. METHODOLOGY
A. Initialization
To provide an initial pose, we use the GNSS measurements $\mathbf{T}_{WG}$ and determine an initial estimate of the starting yaw and