IMU Preintegration on Manifold
paper, we address this issue by preintegrating inertial measurements between selected keyframes. The preintegration allows us to accurately summarize hundreds of inertial measurements into a single relative motion constraint. Our first contribution is a preintegration theory that properly addresses the manifold structure of the rotation group.
Fig. 4: Left: visual and inertial measurements in VIN. Right: factor graph in which several IMU measurements are summarized in a single preintegrated IMU factor and a structureless vision factor constrains keyframes observing the same landmark.

Fig. 5: Different rates for IMU and camera.
Factor graphs and MAP estimation. A factor graph encodes the posterior probability of the variables X_k, given the available measurements Z_k and priors p(X_0):

p(X_k | Z_k) ∝ p(X_0) p(Z_k | X_k) = p(X_0) ∏_{(i,j)∈K_k} p(C_i, I_ij | X_k)
            = p(X_0) ∏_{(i,j)∈K_k} p(I_ij | x_i, x_j) ∏_{i∈K_k} ∏_{l∈C_i} p(z_il | x_i).   (18)

The terms in the factorization (18) are called factors. A schematic representation of the connectivity of the factor graph underlying the problem is given in Fig. 4 (the connectivity of the structureless vision factors will be clarified in Section VI). The MAP estimate corresponds to the maximum of (18) or, equivalently, the minimum of the negative log-posterior. Under the assumption of zero-mean Gaussian noise, the negative log-posterior can be written as a sum of squared residual errors:

X_k^* = arg min_{X_k} −log p(X_k | Z_k)   (19)
      = arg min_{X_k} ‖r_0‖²_{Σ_0} + Σ_{(i,j)∈K_k} ‖r_{I_ij}‖²_{Σ_ij} + Σ_{i∈K_k} Σ_{l∈C_i} ‖r_{C_il}‖²_{Σ_C}

where r_0, r_{I_ij}, r_{C_il} are the residual errors associated with the measurements, and Σ_0, Σ_ij, and Σ_C are the corresponding covariance matrices. The goal of the following sections is to provide expressions for the residual errors.

IV. IMU MODEL AND MOTION INTEGRATION

An IMU measures the rotation rate and the acceleration of the sensor with respect to an inertial frame. The measurements, namely _B ã(t) and _B ω̃_WB(t), are affected by additive white noise η and a slowly varying sensor bias b:

_B ω̃_WB(t) = _B ω_WB(t) + b^g(t) + η^g(t)   (20)
_B ã(t) = R_WB^T(t) (_W a(t) − _W g) + b^a(t) + η^a(t).   (21)

In our notation, the prefix B denotes that the corresponding quantity is expressed in the frame B (cf. Fig. 3). The pose of the IMU is described by the transformation {R_WB, _W p}, which maps a point from sensor frame B to W. The vector _B ω_WB(t) ∈ R³ is the instantaneous angular velocity of B relative to W, expressed in coordinate frame B, while _W a(t) ∈ R³ is the acceleration of the sensor; _W g is the gravity vector in world coordinates. We neglect effects due to earth's rotation, which amounts to assuming that W is an inertial frame.

The goal now is to infer the motion of the system from IMU measurements. For this purpose we introduce the following kinematic model [38, 39]:

Ṙ_WB = R_WB (_B ω_WB)^∧,   _W v̇ = _W a,   _W ṗ = _W v,   (22)

which describes the evolution of the pose and velocity of B. The state at time t + ∆t is obtained by integrating Eq. (22). Applying Euler integration, which is exact assuming that _W a and _B ω_WB remain constant in the interval [t, t + ∆t], we get:

R_WB(t + ∆t) = R_WB(t) Exp(_B ω_WB(t) ∆t)
_W v(t + ∆t) = _W v(t) + _W a(t) ∆t
_W p(t + ∆t) = _W p(t) + _W v(t) ∆t + ½ _W a(t) ∆t².   (23)

Using Eqs. (20)–(21), we can compute _W a and _B ω_WB as a function of the IMU measurements, hence (23) becomes

R(t + ∆t) = R(t) Exp((ω̃(t) − b^g(t) − η^gd(t)) ∆t)
v(t + ∆t) = v(t) + g ∆t + R(t) (ã(t) − b^a(t) − η^ad(t)) ∆t
p(t + ∆t) = p(t) + v(t) ∆t + ½ g ∆t² + ½ R(t) (ã(t) − b^a(t) − η^ad(t)) ∆t²,   (24)

where we dropped the coordinate frame subscripts for readability (the notation should be unambiguous from now on). The covariance of the discrete-time noise η^gd is a function of the sampling rate and relates to the continuous-time spectral noise η^g via Cov(η^gd(t)) = (1/∆t) Cov(η^g(t)); the same relation holds for η^ad (cf. [40, Appendix]).

V. IMU PREINTEGRATION ON MANIFOLD

This section contains the first key contribution of the paper. While Eq. (24) could readily be seen as a probabilistic constraint in a factor graph, it would require including states in the factor graph at a high rate. Here we show that all measurements between two keyframes at times k = i and k = j can be summarized in a single compound measurement, named preintegrated IMU measurement, which constrains the motion between consecutive keyframes. This concept was first proposed in [26] using Euler angles; we extend it by developing a suitable theory for preintegration on manifold. We assume that the IMU is synchronized with the camera and provides measurements at discrete times k (cf. Fig. 5).¹

¹ We calibrate the IMU-camera delay using the Kalibr toolbox [41]. An alternative is to add the delay as a state in the estimation process [42].
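To make the discrete-time model concrete, the following is a minimal Python/numpy sketch (ours, not from the paper) of one integration step of Eq. (24); the helper names hat and Exp and all variable names are illustrative assumptions, and the noise terms η^gd, η^ad are set to zero.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix w^ such that hat(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def Exp(phi):
    """Exponential map so(3) -> SO(3) via Rodrigues' formula."""
    angle = np.linalg.norm(phi)
    if angle < 1e-10:
        return np.eye(3) + hat(phi)  # first-order approximation for small angles
    K = hat(phi / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def integrate_step(R, v, p, omega_meas, acc_meas, bg, ba, g, dt):
    """One Euler step of Eq. (24), with the noise terms set to zero."""
    w = omega_meas - bg                      # bias-corrected gyroscope measurement
    a = acc_meas - ba                        # bias-corrected accelerometer measurement
    R_new = R @ Exp(w * dt)                  # rotation update on the manifold
    v_new = v + g * dt + R @ (a * dt)        # velocity update, gravity in world frame
    p_new = p + v * dt + 0.5 * g * dt**2 + 0.5 * R @ (a * dt**2)
    return R_new, v_new, p_new
```

Iterating this step at IMU rate is exactly the repeated integration that Eq. (25) below makes explicit.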
Iterating the IMU integration (24) for all ∆t intervals between k = i and k = j (cf. Fig. 5), we find:

R_j = R_i ∏_{k=i}^{j−1} Exp((ω̃_k − b^g_k − η^gd_k) ∆t)
v_j = v_i + g ∆t_ij + Σ_{k=i}^{j−1} R_k (ã_k − b^a_k − η^ad_k) ∆t   (25)
p_j = p_i + Σ_{k=i}^{j−1} [ v_k ∆t + ½ g ∆t² + ½ R_k (ã_k − b^a_k − η^ad_k) ∆t² ]

where we introduced the shorthands ∆t_ij ≐ Σ_{k=i}^{j−1} ∆t and (·)_i ≐ (·)(t_i) for readability.

While Eq. (25) already provides an estimate of the motion between time t_i and t_j, it has the drawback that the integration in (25) has to be repeated whenever the linearization point at time t_i changes (intuitively, a change in the rotation R_i implies a change in all future rotations R_k, k = i, …, j−1, and makes it necessary to re-evaluate summations and products in (25)).

Our goal is to avoid repeated integrations. For this purpose, we define the following relative motion increments that are independent of the pose and velocity at t_i:

∆R_ij ≐ R_i^T R_j = ∏_{k=i}^{j−1} Exp((ω̃_k − b^g_k − η^gd_k) ∆t)
∆v_ij ≐ R_i^T (v_j − v_i − g ∆t_ij) = Σ_{k=i}^{j−1} ∆R_ik (ã_k − b^a_k − η^ad_k) ∆t   (26)
∆p_ij ≐ R_i^T (p_j − p_i − v_i ∆t_ij − ½ g ∆t²_ij) = Σ_{k=i}^{j−1} [ ∆v_ik ∆t + ½ ∆R_ik (ã_k − b^a_k − η^ad_k) ∆t² ]

where we defined ∆R_ik ≐ R_i^T R_k and ∆v_ik ≐ v_k − v_i. Unfortunately, summations and products in (26) are still a function of the bias. We tackle this problem in two steps. In Section V-A, we assume b_i is known; then, in Section V-C we show how to avoid repeating the integration when the bias estimate changes. In the rest of the paper, we assume that the bias remains constant between two keyframes:

b^g_i = b^g_{i+1} = … = b^g_{j−1},   b^a_i = b^a_{i+1} = … = b^a_{j−1}.   (27)

A. Preintegrated IMU Measurements

In this section, we assume that the bias at time t_i is known. We want to isolate the noise in (26). Therefore, starting with the rotation increment ∆R_ij, we use the first-order approximation (7) (rotation noise is "small") and rearrange the terms, by "moving" the noise to the end, using the relation (11):

∆R_ij ≈(eq. 7) ∏_{k=i}^{j−1} [ Exp((ω̃_k − b^g_i) ∆t) Exp(−J^k_r η^gd_k ∆t) ]
      =(eq. 11) ∆R̃_ij ∏_{k=i}^{j−1} Exp(−∆R̃^T_{k+1,j} J^k_r η^gd_k ∆t)
      ≐ ∆R̃_ij Exp(−δφ_ij)   (28)

with J^k_r ≐ J_r((ω̃_k − b^g_i) ∆t). We defined the preintegrated rotation measurement ∆R̃_ij ≐ ∏_{k=i}^{j−1} Exp((ω̃_k − b^g_i) ∆t), and its noise δφ_ij, which will be further analysed in the next section.

Substituting (28) back into the expression of ∆v_ij in (26), using the approximation (4) for Exp(−δφ_ij), and dropping higher-order noise terms, we obtain:

∆v_ij ≈(eq. 4) Σ_{k=i}^{j−1} [ ∆R̃_ik (I − δφ^∧_ik) (ã_k − b^a_i) ∆t − ∆R̃_ik η^ad_k ∆t ]
      =(eq. 2) ∆ṽ_ij + Σ_{k=i}^{j−1} [ ∆R̃_ik (ã_k − b^a_i)^∧ δφ_ik ∆t − ∆R̃_ik η^ad_k ∆t ]
      ≐ ∆ṽ_ij − δv_ij   (29)

where we defined the preintegrated velocity measurement ∆ṽ_ij ≐ Σ_{k=i}^{j−1} ∆R̃_ik (ã_k − b^a_i) ∆t and its noise δv_ij.

Similarly, substituting (28) in the expression of ∆p_ij in (26), and using the first-order approximation (4), we obtain:

∆p_ij ≈(eq. 4) Σ_{k=i}^{j−1} [ (∆ṽ_ik − δv_ik) ∆t + ½ ∆R̃_ik (I − δφ^∧_ik) (ã_k − b^a_i) ∆t² − ½ ∆R̃_ik η^ad_k ∆t² ]
      =(eq. 2) ∆p̃_ij + Σ_{k=i}^{j−1} [ −δv_ik ∆t + ½ ∆R̃_ik (ã_k − b^a_i)^∧ δφ_ik ∆t² − ½ ∆R̃_ik η^ad_k ∆t² ]
      ≐ ∆p̃_ij − δp_ij   (30)

where we defined the preintegrated position measurement ∆p̃_ij ≐ Σ_{k=i}^{j−1} [ ∆ṽ_ik ∆t + ½ ∆R̃_ik (ã_k − b^a_i) ∆t² ] and its noise δp_ij.

Substituting the expressions (28), (29), (30) back in the original definition of ∆R_ij, ∆v_ij, ∆p_ij in (26), we finally get our preintegrated measurement model:

∆R̃_ij = R_i^T R_j Exp(δφ_ij)
∆ṽ_ij = R_i^T (v_j − v_i − g ∆t_ij) + δv_ij
∆p̃_ij = R_i^T (p_j − p_i − v_i ∆t_ij − ½ g ∆t²_ij) + δp_ij   (31)

where our compound measurements are written as a function of the (to-be-estimated) state "plus" a random noise, described by the random vector [δφ^T_ij, δv^T_ij, δp^T_ij]^T. The nature of the noise terms is discussed in the following section.

B. Noise Propagation

Let us start with the rotation noise:

Exp(−δφ_ij) ≐ ∏_{k=i}^{j−1} Exp(−∆R̃^T_{k+1,j} J^k_r η^gd_k ∆t).   (32)

Taking the Log on both sides and changing signs, we get:

δφ_ij = −Log( ∏_{k=i}^{j−1} Exp(−∆R̃^T_{k+1,j} J^k_r η^gd_k ∆t) ).   (33)

Repeated application of the first-order approximation (9) (recall that η^gd_k as well as δφ_ij are small rotation noises, hence the right Jacobians are close to the identity) produces:

δφ_ij ≈ Σ_{k=i}^{j−1} ∆R̃^T_{k+1,j} J^k_r η^gd_k ∆t.   (34)

Up to first order, the noise δφ_ij is zero-mean and Gaussian, as it is a linear combination of the zero-mean noise terms η^gd_k. This is desirable, since it brings the rotation measurement model (31) exactly in the form (12).
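As an illustration of Sections V-A and V-B, here is a minimal sketch (our own, under the constant-bias assumption (27), reusing hat and Exp from the previous snippet) that accumulates the preintegrated measurements ∆R̃_ij, ∆ṽ_ij, ∆p̃_ij and propagates the covariance of δφ_ij in a recursion equivalent to Eq. (34); the parameter name sigma_g (continuous-time gyroscope noise density) is an assumed convention.

```python
import numpy as np

def right_jacobian(phi):
    """Right Jacobian Jr(phi) of SO(3); close to identity for small phi."""
    angle = np.linalg.norm(phi)
    K = hat(phi)
    if angle < 1e-10:
        return np.eye(3) - 0.5 * K
    return (np.eye(3)
            - (1.0 - np.cos(angle)) / angle**2 * K
            + (angle - np.sin(angle)) / angle**3 * (K @ K))

def preintegrate(omegas, accs, bg, ba, dt, sigma_g):
    """Accumulate ∆R̃ij, ∆ṽij, ∆p̃ij and Cov(δφij) over measurements k = i..j-1."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    cov_phi = np.zeros((3, 3))
    cov_eta_gd = (sigma_g**2 / dt) * np.eye(3)   # Cov(η^gd) = Cov(η^g)/∆t
    for w_meas, a_meas in zip(omegas, accs):
        a = a_meas - ba                          # bias-corrected accelerometer
        phi = (w_meas - bg) * dt                 # bias-corrected rotation vector
        # position/velocity increments use the rotation accumulated so far
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2
        dv = dv + (dR @ a) * dt
        # iterative form of Eq. (34): δφ_{i,k+1} = Exp(phi)^T δφ_{i,k} + Jr(phi) η^gd ∆t
        A = Exp(phi).T
        B = right_jacobian(phi) * dt
        cov_phi = A @ cov_phi @ A.T + B @ cov_eta_gd @ B.T
        dR = dR @ Exp(phi)
    return dR, dv, dp, cov_phi
```

This incremental form avoids storing the raw measurements and mirrors the incremental covariance computation mentioned in the supplementary material [29].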
Dealing with the noise terms δv_ij and δp_ij is now easy: these are linear combinations of the acceleration noise η^ad_k and the preintegrated rotation noise δφ_ij, hence they are also zero-mean and Gaussian. Therefore, we can fully characterize the noise as:

[δφ^T_ij, δv^T_ij, δp^T_ij]^T ∼ N(0_{9×1}, Σ_ij).   (35)

The expression for the covariance Σ_ij is provided in the supplementary material [29], where we also show that both the preintegrated measurements ∆R̃_ij, ∆ṽ_ij, ∆p̃_ij and the covariance Σ_ij can be computed incrementally.

C. Incorporating Bias Updates

In the previous section, we assumed that the bias b̄_i used to compute the preintegrated measurements is given. However, more likely, the bias estimate changes during optimization. One solution would be to recompute the delta measurements when the bias changes; however, that is computationally expensive. Instead, given a bias update b ← b̄ + δb, we can update the delta measurements using a first-order expansion:

∆R̃_ij(b^g_i) ≈ ∆R̃_ij(b̄^g_i) Exp((∂∆R̄_ij/∂b^g) δb^g)
∆ṽ_ij(b^g_i, b^a_i) ≈ ∆ṽ_ij(b̄^g_i, b̄^a_i) + (∂∆v̄_ij/∂b^g) δb^g_i + (∂∆v̄_ij/∂b^a) δb^a_i   (36)
∆p̃_ij(b^g_i, b^a_i) ≈ ∆p̃_ij(b̄^g_i, b̄^a_i) + (∂∆p̄_ij/∂b^g) δb^g_i + (∂∆p̄_ij/∂b^a) δb^a_i

This is similar to the bias correction in [26] but operates directly on SO(3). The Jacobians {∂∆R̄_ij/∂b^g, ∂∆v̄_ij/∂b^g, …} (computed at b̄_i) describe how the measurements change due to a change in the bias estimate. The derivation of the Jacobians is very similar to the one we used in Section V-A to express the measurements as a large value plus a small perturbation; hence, we omit the complete derivation, which can be found in the supplementary material [29]. Note that the Jacobians remain constant and can be precomputed during the preintegration.

D. Preintegrated IMU Factors

Given the preintegrated measurement model in (31), and since the measurement noise is zero-mean and Gaussian up to first order (35), it is now easy to write the residual errors r_Iij ≐ [r^T_∆Rij, r^T_∆vij, r^T_∆pij]^T ∈ R⁹, where

r_∆Rij ≐ Log( (∆R̃_ij(b̄^g_i) Exp((∂∆R̄_ij/∂b^g) δb^g))^T R_i^T R_j )
r_∆vij ≐ R_i^T (v_j − v_i − g ∆t_ij) − [ ∆ṽ_ij(b̄^g_i, b̄^a_i) + (∂∆v̄_ij/∂b^g) δb^g + (∂∆v̄_ij/∂b^a) δb^a ]
r_∆pij ≐ R_i^T (p_j − p_i − v_i ∆t_ij − ½ g ∆t²_ij) − [ ∆p̃_ij(b̄^g_i, b̄^a_i) + (∂∆p̄_ij/∂b^g) δb^g + (∂∆p̄_ij/∂b^a) δb^a ],   (37)

in which we also included the bias updates of Eq. (36). According to the "lift-solve-retract" method (Section II-C), at each GN iteration we need to re-parametrize (37) using the retraction (15). Then, the "solve" step requires linearizing the resulting cost. For the purpose of linearization, it is convenient to compute analytic expressions of the Jacobians of the residual errors, which we derive in [29].

E. Bias Model

When presenting the IMU model (20), we said that biases are slowly time-varying quantities. Hence, we model them with a "Brownian motion", i.e., integrated white noise:

ḃ^g(t) = η^bg,   ḃ^a(t) = η^ba.   (38)

Integrating (38) over the time interval [t_i, t_j] between two consecutive keyframes i and j, we get:

b^g_j = b^g_i + η^bgd,   b^a_j = b^a_i + η^bad,   (39)

where, as done before, we use the shorthand b^g_i ≐ b^g(t_i), and we define the discrete noises η^bgd and η^bad, which have zero mean and covariance Σ^bgd ≐ ∆t_ij Cov(η^bg) and Σ^bad ≐ ∆t_ij Cov(η^ba), respectively (cf. [40, Appendix]).

The model (39) can be readily included in our factor graph, as a further additive term in (19) for all consecutive keyframes:

‖r_bij‖² ≐ ‖b^g_j − b^g_i‖²_{Σ^bgd} + ‖b^a_j − b^a_i‖²_{Σ^bad}.   (40)

VI. STRUCTURELESS VISION FACTORS

In this section we introduce our structureless model for vision measurements. The key feature of our approach is the linear elimination of landmarks. Note that the elimination is repeated at each Gauss-Newton iteration, hence we are still guaranteed to obtain the optimal MAP estimate.

Visual measurements contribute to the cost (19) via the sum:

Σ_{i∈K_k} Σ_{l∈C_i} ‖r_Cil‖²_{Σ_C} = Σ_{l=1}^{L} Σ_{i∈X(l)} ‖r_Cil‖²_{Σ_C}   (41)

which, on the right-hand side, we rewrote as a sum of contributions of each landmark l = 1, …, L. In (41), X(l) denotes the subset of keyframes in which l is seen.

A fairly standard model for the residual error of a single pixel measurement z_il is [13]:

r_Cil = z_il − π(R_i, p_i, ρ_l),   (42)

where ρ_l ∈ R³ denotes the position of the l-th landmark, and π(·) is a standard perspective projection, which also encodes the (known) IMU-camera transformation T_BC.

Direct use of (42) would require including the landmark positions ρ_l, l = 1, …, L, in the optimization, and this impacts negatively on computation. Therefore, in the following we adopt a structureless approach that avoids optimization over the landmarks, thus ensuring to retrieve the MAP estimate.

As recalled in Section II-C, at each GN iteration, we lift the cost function using the retraction (15). For the vision factors this means that the original residuals (41) become:

Σ_{l=1}^{L} Σ_{i∈X(l)} ‖z_il − π̌(δφ_i, δp_i, δρ_l)‖²_{Σ_C}   (43)

where δφ_i, δp_i, δρ_l are now Euclidean corrections, and π̌(·) is the lifted cost function. The "solve" step in the GN method is based on linearization of the residuals:

Σ_{l=1}^{L} Σ_{i∈X(l)} ‖F_il δT_i + E_il δρ_l − b_il‖²,   (44)

where δT_i ≐ [δφ_i, δp_i]^T; the Jacobians F_il, E_il, and the vector b_il (all normalized by Σ_C^{−1/2}) result from the linearization. The vector b_il is the residual error at the linearization point.

Writing the second sum in (44) in matrix form, we get:

Σ_{l=1}^{L} ‖F_l δT_X(l) + E_l δρ_l − b_l‖²   (45)
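One way to realize the landmark elimination behind the structureless factor is an orthogonal projection onto the left nullspace of E_l, so that δρ_l drops out of (45). The following minimal sketch is our own illustration of this standard construction; the paper's exact elimination is derived in the supplementary material [29].

```python
import numpy as np

def eliminate_landmark(F_l, E_l, b_l):
    """Project the linear system of (45) for one landmark onto the left
    nullspace of E_l, removing the landmark correction from the residual."""
    # Q = I - E (EᵀE)⁻¹ Eᵀ is the orthogonal projector satisfying Q @ E_l = 0
    Q = np.eye(E_l.shape[0]) - E_l @ np.linalg.solve(E_l.T @ E_l, E_l.T)
    return Q @ F_l, Q @ b_l
```

Since Q E_l = 0, minimizing (45) over δρ_l leaves the reduced residual ‖(Q F_l) δT_X(l) − (Q b_l)‖², which depends only on the keyframe corrections.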
avoids every-frame feature extraction, resulting in high-frame-rate motion estimation. Combined with an outlier-resistant probabilistic triangulation method, SVO provides increased robustness in scenes with repetitive and high-frequency texture.

The computation of the MAP estimate in Eq. (19) is based on iSAM2 [24], which is a state-of-the-art incremental smoothing approach. iSAM2 exploits the fact that new measurements often have only a local effect on the MAP estimate, hence it applies incremental updates directly to the square-root information matrix, only re-solving for the variables affected by the new measurements. In odometry problems, the use of iSAM2 results in constant-time updates.

VIII. EXPERIMENTS

The first experiment shows that the proposed approach is more accurate than two competitive state-of-the-art approaches, namely ASLAM [9] and MSCKF [20]. The experiment is performed on the indoor trajectory of Fig. 6. The dataset was recorded with a forward-looking VI-Sensor [44] that consists of an ADIS16448 MEMS IMU and two embedded WVGA monochrome cameras (we only use the left camera). Intrinsic and extrinsic calibration was obtained using [41]. The camera runs at 20Hz and the IMU at 800Hz. Ground truth poses are provided by a Vicon system mounted in the room; the hand-eye calibration between the Vicon markers and the camera is computed using a least-squares method [45].

² http://github.com/uzh-rpg/rpg_svo
³ http://borg.cc.gatech.edu

Fig. 7: Comparison of the proposed approach versus the ASLAM algorithm [9] and an implementation of the MSCKF filter [20]. Relative errors are measured over different segments of the trajectory, of length {10, 40, 90, 160, 250, 360}m, according to the odometric error metric in [46].

Fig. 8: Processing time per keyframe for the proposed VIN approach.

Fig. 7 compares the proposed system against the ASLAM algorithm [9] and an implementation of the MSCKF filter [20]. Both these algorithms currently represent the state-of-the-art in VIN, ASLAM for optimization-based approaches, and MSCKF for filtering methods. We obtained the datasets as well as the trajectories computed with ASLAM and MSCKF from the authors of [9]. We use the relative error metrics proposed in [46] to obtain error statistics. The metric evaluates the relative error by averaging the drift over trajectory segments of different length ({10, 40, 90, 160, 250, 360}m in Fig. 7). Our approach exhibits less drift than the state-of-the-art, achieving 0.3m drift on average over 360m travelled
distance; ASLAM and MSCKF accumulate an average error of 0.7m. We observe significantly less drift in the yaw direction in the proposed approach, while the error in the pitch and roll directions is constant for all methods due to the observability of the gravity direction.

Fig. 9: Outdoor trajectory (length: 300m) around a building with identical start and end point at coordinates (0, 0, 0). The end-to-end error of the proposed approach is 1.0m. Google Tango accumulated 2.2m drift. The green dots are the 3D points triangulated from our trajectory estimate.

Fig. 10: Error committed when using the first-order approximation (36) instead of repeating the integration, for different bias perturbations. Left: ∆R̃_ij(b̄_i + δb_i) error; Center: ∆ṽ_ij(b̄_i + δb_i) error; Right: ∆p̃_ij(b̄_i + δb_i) error. Statistics are computed over 1000 Monte Carlo runs.

Figure 8 illustrates the time required by the back-end to compute the full MAP estimate, by running iSAM2 with 10 optimization iterations. The experiment was performed on a standard laptop (Intel i7, 2.4 GHz). The average update time for iSAM2 is 10ms. The peak corresponds to the start of the experiment, in which the camera was not moving. In this case the number of tracked features becomes very large, making the back-end slightly slower. The SVO front-end requires approximately 3ms to process a frame on the laptop, while the back-end runs in a parallel thread and optimizes only keyframes. Although the processing times of ASLAM were not reported, the approach is described as computationally demanding [9]. ASLAM needs to repeat the IMU integration at every change of the linearization point, which we avoid by using the preintegrated IMU measurements.

The second experiment is performed on an outdoor trajectory, and compares the proposed approach against the Google Tango Peanut sensor (mapper version 3.15), which is an engineered VIN system. We rigidly attached the VI-Sensor to a Tango device and walked around an office building. Fig. 9 depicts the trajectory estimates for our approach and Google Tango. The trajectory starts and ends at the same location, hence we can report the end-to-end error, which is 1.0m for the proposed approach and 2.2m for the Google Tango sensor.

The third experiment is the one in Fig. 1. The trajectory goes across three floors of an office building and eventually returns to the initial location on the ground floor. Also in this case the proposed approach guarantees a very small end-to-end error (0.5m), while Tango accumulates 1.4m error.

We remark that Tango and our system use different sensors, hence the reported end-to-end errors only allow for a qualitative comparison. However, the IMUs of both sensors exhibit similar noise characteristics [47, 48], and the Tango camera has a significantly larger field-of-view and better shutter speed control than our sensor. Therefore, the comparison is still valuable to assess the accuracy of the proposed approach.

The fourth experiment evaluates a specific aspect of our approach: the a-posteriori bias correction of Section V-C. To evaluate the quality of the first-order approximation (36), we performed the following Monte Carlo analysis. First, we computed the preintegrated measurements ∆R̃_ij(b̄_i), ∆ṽ_ij(b̄_i) and ∆p̃_ij(b̄_i) over 100 IMU measurements at a given bias estimate b̄_i. Then, we applied a perturbation δb_i (in a random direction with magnitude between 0.04 and 0.2) to both the gyroscope and accelerometer bias. We repeated the integration at b̄_i + δb_i, and we obtained ∆R̃_ij(b̄_i + δb_i), ∆ṽ_ij(b̄_i + δb_i) and ∆p̃_ij(b̄_i + δb_i). Finally, we compared the result of the integration against the first-order correction in (36). Fig. 10 reports statistics of the errors between the re-integrated variables and the first-order correction, over 1000 Monte Carlo runs. The order of magnitude of the errors suggests that the first-order approximation captures the bias change very well, and can be safely used for relatively large bias fluctuations.

A video demonstrating the execution of our approach for the real experiments discussed in this section can be viewed at https://youtu.be/CsJkci5lfco

IX. CONCLUSION

We propose a novel preintegration theory that provides a grounded way to model a large number of IMU measurements as a single motion constraint. Our proposal improves over related works that perform integration in a global frame, e.g., [8, 20], as we do not commit to a linearization point during integration. Moreover, it leverages the seminal work on preintegration [26], bringing to maturity the preintegration and uncertainty propagation in SO(3). We also discuss how to use the preintegrated IMU model in a VIN pipeline; we adopt a structureless model for visual measurements which avoids optimizing over 3D landmarks. Our VIN approach uses iSAM2 to perform constant-time incremental smoothing.

An efficient implementation of our approach requires 10ms to perform inference (back-end), and 3ms for feature tracking (front-end). We provide comparisons against state-of-the-art alternatives, including filtering and optimization-based techniques. We release the source code of the IMU preintegration and the structureless vision factors in the GTSAM 4.0 optimization toolbox [30] and provide additional theoretical derivations and implementation details in the supplementary material [29].

Acknowledgments: The authors gratefully acknowledge Stephen Williams and Richard Roberts for helping with an early implementation in GTSAM, and Simon Lynen and Stefan Leutenegger for providing datasets and results of their algorithms.
REFERENCES

[1] Google. Project Tango. URL https://www.google.com/atap/projecttango/.
[2] A. Martinelli. Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale, and bias determination. IEEE Trans. Robotics, 28(1):44–60, 2012.
[3] S-H. Jung and C.J. Taylor. Camera trajectory estimation using inertial sensor measurements and structure from motion results. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2001.
[4] D. Strelow and S. Singh. Motion estimation from image and inertial measurements. 2004.
[5] A.I. Mourikis and S.I. Roumeliotis. A dual-layer estimator architecture for long-term localization. In Proc. of the Workshop on Visual Localization for Mobile Platforms at CVPR, Anchorage, Alaska, June 2008.
[6] G. Sibley, L. Matthies, and G. Sukhatme. Sliding window filter with application to planetary landing. J. of Field Robotics, 27(5):587–608, 2010.
[7] T-C. Dong-Si and A.I. Mourikis. Motion tracking with fixed-lag smoothing: Algorithm consistency and analysis. In IEEE Intl. Conf. on Robotics and Automation (ICRA), 2011.
[8] S. Leutenegger, P. Furgale, V. Rabaud, M. Chli, K. Konolige, and R. Siegwart. Keyframe-based visual-inertial SLAM using nonlinear optimization. In Robotics: Science and Systems (RSS), 2013.
[9] S. Leutenegger, S. Lynen, M. Bosse, R. Siegwart, and P. Furgale. Keyframe-based visual-inertial SLAM using nonlinear optimization. Intl. J. of Robotics Research, 2014.
[10] N. Keivan, A. Patron-Perez, and G. Sibley. Asynchronous adaptive conditioning for visual-inertial SLAM. In Intl. Sym. on Experimental Robotics (ISER), 2014.
[11] D.-N. Ta, K. Ok, and F. Dellaert. Vistas and parallel tracking and mapping with wall-floor features: Enabling autonomous flight in man-made environments. Robotics and Autonomous Systems, 62(11):1657–1667, 2014. doi: http://dx.doi.org/10.1016/j.robot.2014.03.010. Special Issue on Visual Control of Mobile Robots.
[12] S. Shen. Autonomous Navigation in Complex Indoor and Outdoor Environments with Micro Aerial Vehicles. PhD thesis, University of Pennsylvania, 2014.
[13] V. Indelman, S. Williams, M. Kaess, and F. Dellaert. Information fusion in navigation systems via factor graph based incremental smoothing. Robotics and Autonomous Systems, 61(8):721–738, August 2013.
[14] M. Bryson, M. Johnson-Roberson, and S. Sukkarieh. Airborne smoothing and mapping using vision and inertial sensors. In IEEE Intl. Conf. on Robotics and Automation (ICRA), pages 3143–3148, 2009.
[15] A. Patron-Perez, S. Lovegrove, and G. Sibley. A spline-based trajectory representation for sensor fusion and rolling shutter cameras. Intl. J. of Computer Vision, February 2015.
[16] H. Strasdat, J.M.M. Montiel, and A.J. Davison. Real-time monocular SLAM: Why filter? In IEEE Intl. Conf. on Robotics and Automation (ICRA), pages 2657–2664, 2010.
[17] G. Klein and D. Murray. Parallel tracking and mapping on a camera phone. In IEEE and ACM Intl. Sym. on Mixed and Augmented Reality (ISMAR), 2009.
[18] E.D. Nerurkar, K.J. Wu, and S.I. Roumeliotis. C-KLAM: Constrained keyframe-based localization and mapping. In Robotics: Science and Systems (RSS), 2013.
[19] G. Klein and D. Murray. Parallel tracking and mapping for small AR workspaces. In IEEE and ACM Intl. Sym. on Mixed and Augmented Reality (ISMAR), pages 225–234, Nara, Japan, Nov 2007.
[20] A.I. Mourikis and S.I. Roumeliotis. A multi-state constraint Kalman filter for vision-aided inertial navigation. In IEEE Intl. Conf. on Robotics and Automation (ICRA), pages 3565–3572, April 2007.
[21] E.S. Jones and S. Soatto. Visual-inertial navigation, mapping and localization: A scalable real-time causal approach. Intl. J. of Robotics Research, 30(4), Apr 2011.
[22] G.P. Huang, A.I. Mourikis, and S.I. Roumeliotis. An observability-constrained sliding window filter for SLAM. In IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS), pages 65–72, 2011.
[23] M. Kaess, A. Ranganathan, and F. Dellaert. iSAM: Incremental smoothing and mapping. IEEE Trans. Robotics, 24(6):1365–1378, Dec 2008.
[24] M. Kaess, H. Johannsson, R. Roberts, V. Ila, J. Leonard, and F. Dellaert. iSAM2: Incremental smoothing and mapping using the Bayes tree. Intl. J. of Robotics Research, 31:217–236, Feb 2012.
[25] V. Indelman, A. Melim, and F. Dellaert. Incremental light bundle adjustment for robotics navigation. In IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS), November 2013.
[26] T. Lupton and S. Sukkarieh. Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions. IEEE Trans. Robotics, 28(1):61–76, Feb 2012.
[27] M. Moakher. Means and averaging in the group of rotations. SIAM Journal on Matrix Analysis and Applications, 24(1):1–16, 2002.
[28] L. Carlone, Z. Kira, C. Beall, V. Indelman, and F. Dellaert. Eliminating conditionally independent sets in factor graphs: A unifying perspective based on smart factors. In IEEE Intl. Conf. on Robotics and Automation (ICRA), 2014.
[29] C. Forster, L. Carlone, F. Dellaert, and D. Scaramuzza. Supplementary material to: IMU preintegration on manifold for efficient visual-inertial maximum-a-posteriori estimation. Technical Report GT-IRIM-CP&R-2015-001, Georgia Institute of Technology, 2015.
[30] F. Dellaert. Factor graphs and GTSAM: A hands-on introduction. Technical Report GT-RIM-CP&R-2012-002, Georgia Institute of Technology, 2012.
[31] G.S. Chirikjian. Stochastic Models, Information Theory, and Lie Groups, Volume 2: Analytic Methods and Modern Applications (Applied and Numerical Harmonic Analysis). Birkhauser, 2012.
[32] Y. Wang and G.S. Chirikjian. Nonparametric second-order theory of error propagation on motion groups. Intl. J. of Robotics Research, 27(11–12):1258–1273, 2008.
[33] T.D. Barfoot and P.T. Furgale. Associating uncertainty with three-dimensional poses for use in estimation problems. IEEE Trans. Robotics, 30(3):679–693, 2014.
[34] Y. Wang and G.S. Chirikjian. Error propagation on the Euclidean group with applications to manipulator kinematics. IEEE Trans. Robotics, 22(4):591–602, 2006.
[35] P.A. Absil, C.G. Baker, and K.A. Gallivan. Trust-region methods on Riemannian manifolds. Foundations of Computational Mathematics, 7(3):303–330, 2007.
[36] S.T. Smith. Optimization techniques on Riemannian manifolds. Hamiltonian and Gradient Flows, Algorithms and Control, Fields Inst. Commun., Amer. Math. Soc., 3:113–136, 1994.
[37] J.H. Manton. Optimization algorithms exploiting unitary constraints. IEEE Trans. Signal Processing, 50(3):635–650, 2002.
[38] R.M. Murray, Z. Li, and S. Sastry. A Mathematical Introduction to Robotic Manipulation. CRC Press, 1994.
[39] J.A. Farrell. Aided Navigation: GPS with High Rate Sensors. McGraw-Hill, 2008.
[40] J.L. Crassidis. Sigma-point Kalman filtering for integrated GPS and inertial navigation. IEEE Trans. Aerosp. Electron. Syst., 42(2):750–756, 2006.
[41] P. Furgale, J. Rehder, and R. Siegwart. Unified temporal and spatial calibration for multi-sensor systems. In IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS), 2013.
[42] M. Li and A.I. Mourikis. Online temporal calibration for camera-IMU systems: Theory and algorithms. Intl. J. of Robotics Research, 33(6), 2014.
[43] C. Forster, M. Pizzoli, and D. Scaramuzza. SVO: Fast semi-direct monocular visual odometry. In IEEE Intl. Conf. on Robotics and Automation (ICRA), 2014. doi: 10.1109/ICRA.2014.6906584.
[44] J. Nikolic, J. Rehder, M. Burri, P. Gohl, S. Leutenegger, P. Furgale, and R. Siegwart. A synchronized visual-inertial sensor system with FPGA pre-processing for accurate real-time SLAM. In IEEE Intl. Conf. on Robotics and Automation (ICRA), 2014.
[45] F.C. Park and B.J. Martin. Robot sensor calibration: Solving AX=XB on the Euclidean group. IEEE Trans. Robotics and Automation, 10(5), 1994.
[46] A. Geiger, P. Lenz, and R. Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 3354–3361, Providence, USA, June 2012.
[47] Tango IMU specifications. URL http://ae-bst.resource.bosch.com/media/products/dokumente/bmx055/BST-BMX055-FL000-00_2013-05-07_onl.pdf.
[48] ADIS IMU specifications. URL http://www.analog.com/media/en/technical-documentation/data-sheets/ADIS16448.pdf.