Vision-Based Navigation for a Small Fixed-Wing

Airplane in Urban Environment

Myung Hwangbo

CMU-RI-TR-12-11

Submitted in partial fulfillment of the


requirements for the degree of
Doctor of Philosophy in Robotics.

The Robotics Institute


Carnegie Mellon University
Pittsburgh, PA 15213

May 2012

Thesis Committee:
Takeo Kanade (chair)
Omead Amidi
Sanjiv Singh
Randy Beard, Brigham Young University

© 2012 Myung Hwangbo. All rights reserved.

Keywords: Unmanned aerial vehicle, fixed-wing airplane, autonomous navigation, vision-
based attitude estimation, sensor calibration, sampling-based motion planning, air-slalom task
This dissertation is dedicated to

my lovely wife, Hyojin,

who has been my greatest support, encouragement and shelter


during my time at Carnegie Mellon.
Abstract

An urban operation of unmanned aerial vehicles (UAVs) demands a high level of


autonomy for tasks presented in a cluttered environment. While fixed-wing UAVs are
well suited for long-endurance missions at a high altitude, enabling them to navigate
inside an urban area brings another level of challenges. Their inability to hover and their
low agility in motion make it more difficult to find a feasible path to move safely
in a compact region, and the limited payload allows only low-grade sensors for state
estimation and control.
We address the problem of achieving vision-based autonomous navigation for a
small fixed-wing UAV in an urban area with contributions to several key
topics. Firstly, for robust attitude estimation during dynamic maneuvering, we take
advantage of the line regularity in an urban scene, which features vertical and hor-
izontal edges of man-made structures. The sensor fusion with gravity-related line
segments and gyroscopes in a Kalman filter can provide driftless and realtime atti-
tude for flight stabilization. Secondly, as a prerequisite to sensor fusion, we present
a convenient self-calibration scheme based on the factorization method. Natural ref-
erences such as gravity, vertical edges, and distant scene points, available in urban
fields, are sufficient to find intrinsic and extrinsic parameters of inertial and vision
sensors. Lastly, to generate a dynamically feasible motion plan, we propose a discrete
planning method that encodes a path into interconnections of finite trim states, which
allow a significant dimension reduction of a search space and result in naturally im-
plementable paths integrated with flight controllers. The most probable path to reach
a target is computed by the Markov Decision Process with motion uncertainty due to
wind, and a minimum target observation time is imposed on the final motion plan to
consider a camera’s limited field-of-view.
In this thesis, the effectiveness of our vision-based navigation system is demon-
strated by what we call an "air slalom" task in which the UAV must autonomously
search and localize multiple gates, and pass through them sequentially. Experiment
results with a 1m wing-span airplane show essential navigation capabilities demanded
in urban operations such as maneuvering through passageways between buildings.
Contents

1 Introduction 1
1.1 Autonomy in Urban Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Vision-based Navigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Fixed-wing UAV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.5 Thesis Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2 Air Slalom Task 5


2.1 Task Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Navigation System Requirement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

3 Visual/Inertial Attitude Estimation 9


3.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2 Attitude from Parallel Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.2.1 Vertical vanishing point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2.2 Horizontal vanishing point . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3 Visual/Inertial Sensor Fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.3.1 Process model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3.2 Line measurement update . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.4 Line Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4.1 RANSAC-based vanishing point detection . . . . . . . . . . . . . . . . . . . 18
3.4.2 Hierarchical line grouping . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.5 Horizon Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.5.1 Horizon hypothesis generation . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.5.2 Horizon hypothesis test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.6 Performance Evaluation in Graphical Simulator . . . . . . . . . . . . . . . . . . . . 27
3.7 Experiment Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.7.1 Outdoor Urban Scene . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31


3.7.2 Comparison with inertial-only and vision-only methods . . . . . . . . . . . 32

4 Visual/Inertial Sensor Calibration 37


4.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.2 Factorization Method for IMU Self-Calibration . . . . . . . . . . . . . . . . . . . . 39
4.2.1 Matrix reconstruction and load constraints . . . . . . . . . . . . . . . . . . 41
4.2.2 Natural reference features for self-calibration . . . . . . . . . . . . . . . . . 42
4.2.3 Redundant vs. triad IMU . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.3 Redundant IMU Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.3.1 Shape recovery procedure from load constraints . . . . . . . . . . . . . . . . 43
4.3.2 Linear solution space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.4 Triad IMU Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.4.1 Iterative factorization algorithm . . . . . . . . . . . . . . . . . . . . . . . . 47
4.4.2 Bias convergence region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.5 Simulation Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.6 Camera/IMU Calibration for Sensor Fusion . . . . . . . . . . . . . . . . . . . . . . 52
4.6.1 Camera vs. accelerometer . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.6.2 Camera vs. gyroscope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.7 Experiment Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

5 Motion Planning and Control 67


5.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.2 Motion primitives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.2.1 Trim states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.2.2 Generation of motion primitives . . . . . . . . . . . . . . . . . . . . . . . . 70
5.3 Search-based Motion Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.3.1 3D Dubins heuristic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.3.2 Greedy best-first search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.4 MDP-based Motion Planning under Motion Uncertainty . . . . . . . . . . . . . . . 76
5.4.1 Probabilistic model of a motion primitive . . . . . . . . . . . . . . . . . . . 76
5.4.2 State space discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.4.3 Markov Decision Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.5 Target visibility constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.5.1 Minimum target observation time . . . . . . . . . . . . . . . . . . . . . . . . 81
5.6 Motion Plans in 3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.7 Entering a gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.7.1 Vector field along a straight line . . . . . . . . . . . . . . . . . . . . . . . . 84

6 Air-Slalom Experiment 87
6.1 UAV system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.1.1 Hardware and Software Architecture . . . . . . . . . . . . . . . . . . . . . . 87
6.1.2 UAV Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.2 Virtual gate and target detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.3 Vehicle and target state estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.4 Experimental results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.4.1 Single-gate air slalom . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.4.2 Multiple-gate air slalom . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

7 Conclusion and Future Work 107


7.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

A Derivation of Jacobian for Line Angle 111


A.1 Derivation of Jacobian in EKF Prediction . . . . . . . . . . . . . . . . . . . . . . . 111
A.2 Derivation of Jacobian in EKF Update . . . . . . . . . . . . . . . . . . . . . . . . . 111

B Discussion on Factorization-based Calibration 115


B.1 Practical Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
B.2 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
B.3 Minimum number of load constraints . . . . . . . . . . . . . . . . . . . . . . . . . . 116
B.4 Gravity magnitude constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
B.5 Degenerate load configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Bibliography 121
List of Figures

1.1 Prospective urban operations of a fixed-wing UAV . . . . . . . . . . . . . . . . . . 2


1.2 Three typical types of small and micro UAVs . . . . . . . . . . . . . . . . . . . . . 3

2.1 Air-slalom task for vision-based UAV navigation as a benchmark test . . . . . . . . 6


2.2 Diagram of a vision-based navigation system for the air-slalom task . . . . . . . . . 7

3.1 Visual features for attitude estimation in urban scenes at different altitudes . . . . 10
3.2 Horizon and vanishing points in a Manhattan world . . . . . . . . . . . . . . . . . 12
3.3 Representation of vanishing points on a Gaussian sphere . . . . . . . . . . . . . . . 13
3.4 Coordinate transformation between the world frame to the camera frame . . . . . 14
3.5 Sensor fusion diagram for visual/inertial attitude estimation in the EKF . . . . . . 15
3.6 Observation model of a line segment in the Kalman update . . . . . . . . . . . . . 16
3.7 Determination of a desired vanishing point for horizontal line measurements . . . . 18
3.8 Effect of K-means clustering (k = 2) in line classification . . . . . . . . . . . . . . . 21
3.9 Hierarchical line clustering results on simulated Manhattan-world images . . . . . 24
3.10 Hierarchical line clustering results on aerial urban images . . . . . . . . . . . . . . 25
3.11 Simulated images in a Manhattan world . . . . . . . . . . . . . . . . . . . . . . . . 28
3.12 Simulation evaluation of the visual/inertial attitude estimation method . . . . . . 30
3.13 Comparison of attitude estimation performance between partial, full and redundant
information of the horizon measurement H in the experiment . . . . . . . . . . . . 32
3.14 Plots of attitude error and covariance of the visual/inertial attitude estimation in
the experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.15 Attitude and vanishing points plotted on the Gaussian sphere . . . . . . . . . . . . 34
3.16 Comparison of average and maximum attitude estimation errors between the vision-
only, inertial-only, and visual/inertial attitude estimation methods . . . . . . . . . 35
3.17 Plots of attitude errors of the vision-only, inertial-only and visual/inertial attitude
estimation methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35


4.1 Calibration parameters of a camera/IMU system . . . . . . . . . . . . . . . . . . . 40


4.2 Gravity magnitude constraint Cτ∗=1 for accelerometer calibration . . . . . . . . . . 42
4.3 Angular speed constraint C ∗ for gyroscope calibration . . . . . . . . . . . . . . . . 42
4.4 Three cases on the number of solutions for q when rank(L) = 9 . . . . . . . . . . . 44
4.5 IMU calibration performance with respect to component size and noise . . . . . . . 49
4.6 IMU calibration performance with respect to load size and noise . . . . . . . . . . 49
4.7 IMU calibration performance with mixed constraint types . . . . . . . . . . . . . . 49
4.8 Bias convergence region of the triad-IMU calibration . . . . . . . . . . . . . . . . . 50
4.9 Systematic error of camera rotation angle θc obtained from image homography . . 52
4.10 Camera/accelerometer calibration using gravity and vertical edges . . . . . . . . . 53
4.11 Camera/gyroscope calibration using image feature tracks . . . . . . . . . . . . . . 54
4.12 Calibration experiment system consisting of two tri-axial IMUs and a camera . . . 56
4.13 Plots of accelerometer measurements during tripod operation . . . . . . . . . . . . 56
4.14 Plots of estimation variances of accelerometer calibration parameters . . . . . . . . 57
4.15 Reconstructed shape and motion matrix of two tri-axial accelerometers (n = 6) . . 57
4.16 Convergence rate of the bias in the iterative linear calibration (n = 3) . . . . . . . 58
4.17 Feature correspondences of distant scene points and corresponding gyroscope mea-
surements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.18 Reconstructed shape and motion matrix of two tri-axial gyroscopes (n = 6) . . . . 61
4.19 Vertical line segments for camera/accelerometer calibration . . . . . . . . . . . . . 62
4.20 Feature correspondences and gyroscope measurements for online calibration during
the aerial maneuvering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

5.1 Trim states of a fixed-wing airplane and a finite state machine between trim states 69
5.2 Reference profile for roll and pitch angles in the building of lateral and longitudinal
motion primitives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.3 A set of lateral and longitudinal motion primitives at discretized trim states . . . . 71
5.4 Reachability tree expanded by 9 lateral and 5 longitudinal motion primitives . . . 71
5.5 2D and 3D Dubins heuristic functions for the optimal distance to a goal . . . . . . 72
5.6 2D motion planning examples of the greedy best-first search . . . . . . . . . . . . . 75
5.7 3D motion planning examples of the greedy best-first search . . . . . . . . . . . . . 75
5.8 A set of lateral motion primitives used in the MDP-based motion planning . . . . 77
5.9 Uncertainty of lateral motion primitives in the real world . . . . . . . . . . . . . . 78
5.10 Probabilistic model of a state transition in a Markov Decision Process . . . . . . . 78
5.11 MDP-based optimal motion plans in an obstacle-free 2D space . . . . . . . . . . . 80
5.12 Comparison between stochastic MDP-based optimal path and deterministic short-
est path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

5.13 Visibility goal region Gv as an intermediate goal in two-step MDP planning scheme 82
5.14 Comparison between MDP motion plans when the target visibility constraint is
ignored or considered . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.15 3D MDP-based optimal motion plans . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.16 Vector field of a desired heading angle along a straight line for gate entering . . . . 85

6.1 UAV system architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88


6.2 UAV testbed platform and onboard components . . . . . . . . . . . . . . . . . . . 89
6.3 Experimental setup for the air slalom task using virtual gates conceivable from
ground targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.4 Ground target detection using color matching and image blobing . . . . . . . . . . 91
6.5 An Unscented Kalman filter for vehicle and target state estimation . . . . . . . . . 92
6.6 Plots of the target state estimation at the gate search stage . . . . . . . . . . . . . 98
6.7 Trajectory plot of the single-gate air slalom task . . . . . . . . . . . . . . . . . . . 99
6.8 Progress of MDP motion plans in the single-gate air slalom task . . . . . . . . . . 100
6.9 Gate entrance by the straight line following and plots of heading and roll angles . . 101
6.10 Plots of autopilot controls for lateral and longitudinal motions in the single-gate
air slalom task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.11 Trajectory plot of the multiple-gate air slalom task . . . . . . . . . . . . . . . . . . 104
6.12 Progress of MDP motion plans in the multiple-gate air slalom task . . . . . . . . . 105
6.13 Plots of autopilot controls for lateral and longitudinal motions in the multiple-gate
air slalom task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

B.1 Geometric interpretation of calibrating load sets F in Cτ∗=1 . . . . . . . . . . . . . . 119


B.2 Degenerate and non-degenerate calibrating loads F in Cτ∗=1 . . . . . . . . . . . . . 119
List of Tables

3.1 Comparison between sensor fusion methods that combine gyroscopes with other
complementary sensors for driftless attitude estimation . . . . . . . . . . . . . . . . 11
3.2 Comparison of attitude estimation performance between partial, full and redundant
information of the horizon measurement H . . . . . . . . . . . . . . . . . . . . . . 29
3.3 Computation time for classified vertical and horizontal line measurements . . . . . 29

4.1 Comparison between sensor calibration methods for inertial and visual sensors . . 39
4.2 List of solution schemes for the redundant IMU calibration according to rank(L) . 45
4.3 Simulation evaluation of calibration performance using Cτ∗=1 . . . . . . . . . . . . . 50
4.4 Failure rate of the factorization for a redundant IMU when Cτ∗=1 is used . . . . . . 50
4.5 Experiment results on accelerometer calibration for redundant and tri-axial IMUs . 57
4.6 Experiment results on gyroscope self-calibration . . . . . . . . . . . . . . . . . . . . 61
4.7 Comparison of calibration results between the factorization-based method and con-
strained nonlinear optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.8 Experiment results on camera/accelerometer calibration . . . . . . . . . . . . . . . 63
4.9 Experiment results on camera/accelerometer calibration in the minimal condition . 63
4.10 Experiment results of online gyroscope calibration during aerial maneuvering . . . 65

6.1 Specifications of the low-cost UAV system . . . . . . . . . . . . . . . . . . . . . . . 88

Chapter 1

Introduction

Unmanned Aerial Vehicles (UAVs) have been under development since the beginning of flight
because they can eliminate the risk to a pilot’s life and they are generally inexpensive to procure.
Most UAVs are controlled remotely from a ground control station but automation systems are
being progressively integrated for autonomous operation. GPS (Global Positioning System) and
other sophisticated navigation systems make it possible for UAVs to take over high-risk aerial
missions that once required manned aircraft.
The main role of UAVs thus far has been information gathering. UAVs' aerial viewpoint
provides a better perspective; they can also cover a large area at higher speeds
than ground vehicles, and explore viewpoints that are not constrained to a road network. While
intelligence, surveillance and reconnaissance missions still remain their predominant tasks, their
roles are expanding to diverse civilian, federal and commercial areas including law enforcement,
environment monitoring, network connection and communications relay [1].
One active research direction for UAVs is the development of miniature systems such as man-
portable UAVs or MAVs (Micro Air Vehicles), whose wingspan is less than 6 inches [2]. As
hardware continues to shrink in size, weight and cost, there is a dramatic rise in both interest and
application for miniaturized UAVs. Small size and improved endurance can provide a significant
advantage in urban and other high-density environments.

1.1 Autonomy in Urban Applications

In contrast to the recognized value of military applications, it is rare to encounter successful


UAV deployment in civil applications. This is because safety remains an important issue in civil
airspace and UAVs can be a hazard to other aircraft, ground vehicles, people and facilities. Near-
human capabilities, which are quite challenging to design, are required to ensure safety as the
Federal Aviation Administration (FAA) has a regulation that UAVs must have onboard detect,
see and avoid capabilities to prevent in-air collisions [1].

Figure 1.1: Prospective urban operations of a fixed-wing UAV: air-to-ground target tracking, surveillance
in urban canyons, and UAV & UGV collaborative missions.


Autonomous flight refers to the UAVs’ capability to carry out an entire task independently
without input from a remote operator. Prospective UAV urban operations in Figure 1.1 show
that the number and complexity of obstacles and the density of operations increase the number of
decisions that must be made, and compress the time required for the decisions. The importance
and difficulty of this problem motivates the development of increased levels of autonomy to
support urban UAV operations.

1.2 Vision-based Navigation

One of the critical requirements to make a UAV autonomous and practical is a reliable estimation
of its position and orientation as well as a 3D mapping of its surrounding environment. Among
many possible sensors for these purposes, a camera has been acclaimed for its unique advantage
to deliver multi-layered information in the format of images. In contrast to navigational sensors
such as GPS and INS (Inertial Navigation System), which provide information only about the
vehicle’s own motion with respect to the inertial frame, vision can provide additional information
relative to the environment, for example, how close the vehicle is to an obstacle, where targets
are located in an area, or how the vehicle is aligned with the horizon.
Unlike GPS, which may stop working when satellite visibility is blocked, vision keeps
working in a cluttered urban environment and even indoors as long as the captured images have
sufficient texture and illumination. The drawback is that a substantial computational burden is
required to extract useful information from images, and the information is scene-dependent.

1.3 Fixed-wing UAV

There are three common types of miniaturized UAVs, as shown in Figure 1.2. Each type has its
own advantages and disadvantages, and different task scenarios call for different types of UAVs.

Figure 1.2: Typical types of small and micro UAVs: an airplane-like fixed-wing model, bird or insect-like
ornithopter (flapping wing) model, and helicopter-like rotary wing models (ducted fan and quad rotor).

For instance, fixed wings (drones) are well suited for tasks that require high endurance and
extended loitering times, but generally they do not work indoors as they cannot hover or make
tight turns. Rotary wings, on the other hand, allow hovering and movement in any direction at
the cost of short flight time and complex design. Flapping wings may offer the most potential
for miniaturization and maneuverability, but in terms of real implementation, they continue to
be far inferior to the other UAV types.
Fixed wings have unique benefits such as readiness to fly and glide, lower running cost and a
wide working area. Their lack of hovering ability, however, causes challenges for motion planning
and obstacle avoidance in complex environments due to a minimum forward velocity and limited
agility in motion.

1.4 Problem Statement

In this thesis, we will investigate autonomous algorithms that enable a small fixed-wing vehicle
to navigate in an urban environment at low altitudes and achieve a given task in the following
operational situations:

Commodity-level fixed-wing airplane Our UAV system development is targeted to au-


tomate an ordinary remote-controlled fixed-wing airplane, which is readily available on the market
for hobbyists. No special design of a wing’s airfoil shape or control surface for a specific task is
considered. A high budget-performance ratio is expected from nearly the simplest platform that
can offer aerial navigation.

Low-grade sensors Due to the small payload allowed for a small UAV, the sensing capability
is subject to simple and unsophisticated lightweight onboard sensors. A low-cost avionics system
equipped with on-chip GPS, IMU and pressure sensors is considered. The camera is restricted to
one with a low-resolution image and a limited field-of-view lens due to the payload constraint and
the consequently small onboard computational power.

Sensor fusion and calibration Vehicle and target states required in a given navigation task
are estimated from low-grade sensors. Sensor fusion is therefore necessary to combine sensors’
complementary measurements and yield a reliable and realtime state estimation. Prior to the
fusion, sensor calibration for an IMU and camera is a necessary step. It would be more favorable
if the calibration can be done in the field, because the sensors are subject to replacement and frequent
setting changes in an expendable system.
Urban environment The environment of interest is mostly composed of man-made struc-
tures such as buildings, towers and traffic roads. Rather than a series of steady flights between
way-points, an urban operation requires continuous dynamic maneuvering at a low altitude, and
deliberate motion plans to reach a goal in a complex environment where close interactions often
occur with the surroundings. A demonstrative task asks the UAV to pass through multiple gates
in different orientations, which emulates common situations in an urban area such as maneuvering
through passageways between buildings.

1.5 Thesis Statement


In this thesis, we will present a vision-based navigation framework for a small, unmanned fixed-
wing airplane operating in an urban environment. The core problems we address here are the
development and integration of new algorithms on attitude estimation, sensor calibration, and
motion planning. The effectiveness of our framework is evaluated and demonstrated through an
air-slalom task. This thesis makes the following key statements:

• It is possible to produce a driftless and realtime attitude estimation for flight control when
gravity-related line segments in an urban scene are combined with inertial sensing.
• Convenient and on-site sensor calibration is available for inertial and visual sensor fusion
from natural references such as gravity, vertical edges and distant scene points.
• The use of trim states of a fixed-wing airplane is an efficient way to build discrete motion primitives and
generate a feasible and practically implementable motion plan.

The framework and methods developed in this thesis are readily transferable to other
fixed-wing platforms and other urban task scenarios. See Figure 2.1 for an overview of the thesis
organization.
Chapter 2

Air Slalom Task

One of the most common maneuvers demanded for a fixed-wing UAV in an urban environment
is to approach a target or region of interest in a specified direction. This kind of maneuver is
necessary, for example, when the UAV is in the middle of entering a gate or passageway between
buildings, following a target moving on a road network, or landing on a runway or driveway.
Unlike rotor-type air vehicles that can make a standing rotation, the change of an approaching
direction raises a challenge for fixed-wings because they require not only space and time to make
a turn but also a deliberate motion plan to meet the new direction.
This fact leads us to design a benchmark mission that we call the air-slalom task, which is
focused on the evaluation and demonstration of vision-based autonomous navigation algorithms
for a fixed-wing UAV by presenting a challenging course. As illustrated in Figure 2.1, an air vehicle
needs to complete tight turns through a slalom course consisting of multiple gates. In order to
precisely enter multiple gates in a row, it is essential to integrate advanced realtime motion
planning as well as reliable target and vehicle state estimation into the navigation system. Like
the Red Bull Air Race for human pilots [3], this benchmark task allows intelligent UAV systems
to compete with one another for the fastest completion time.

2.1 Task Description

Prior to flight, the location and orientation of gates are completely unknown to a UAV, but their
geometric shapes and colors are indicated so that their poses are visually identifiable. The UAV
needs to search, localize and pass through this series of gates in a designated order with no human
intervention. These multiple gates are placed with different entrance orientations in a twisting
and potentially difficult manner. Flying too low or too high with respect to a gate disqualifies
the gate entrance.


Figure 2.1: An air-slalom task designed as a benchmark test of an autonomous vision-based navigation
system of a small fixed-wing UAV in an urban environment

2.2 Navigation System Requirement

A high-level autonomous system for mission activities in aerial navigation involves the proper
integration of many intelligent sub-systems. Figure 2.2 shows a diagram of functional blocks
generally required to accomplish the air-slalom task when conventional avionic sensors such as
GPS, IMU and camera are equipped onboard. The navigation system has to minimally include
functional blocks for flight control, attitude estimation, vehicle and target state estimation, target
detection, motion planning, and sensor calibration.
Among the many requirements for an intelligent navigation system, this thesis is dedicated
to the research and development of the following fundamental functions.

Attitude estimation for flight control Reliable and realtime attitude estimation is essen-
tial for robust flight control during dynamic maneuvering. The regularity of parallel line segments
appearing in an urban scene is utilized to recover an absolute measurement of the vehicle atti-
tude. A Kalman filter framework to combine visual and inertial measurements will be described
in Chapter 3.

Sensor self-calibration The calibration of intrinsic and extrinsic parameters of a cam-
era/IMU system is mandatory before the sensor fusion. The use of natural references, such as
gravity and distant scene points, provides in-situ calibration capability whenever a sensor setting
changes in the field. A linear or iterative calibration scheme based on the factorization method will
be presented in Chapter 4.

Figure 2.2: Diagram of a vision-based navigation system for the air-slalom task. Functional blocks: sensor
calibration (Chapter 4), attitude estimation (Chapter 3), motion planning (Chapter 5), gate detection and
vehicle/target state estimation (Chapter 6), and flight control, fed by the camera, IMU, and GPS.

Motion planning of a fixed-wing UAV A vehicle trajectory that results in success when
approaching and entering a gate should be feasible for a fixed-wing airplane. It is also preferred
that the trajectory be easily executable and implementable. An efficient discrete motion planning
method that interconnects a vehicle’s trim states will be described in Chapter 5.

Integration and experiments The navigation system additionally integrates target detec-
tion and state estimation of vehicle and target postures for the air-slalom task. The description
of the hardware and software architecture of our UAV system and the results of the air-slalom
experiment will be presented in Chapter 6.
Chapter 3

Visual/Inertial Attitude Estimation

Attitude is a fundamental vehicle state towards stabilizing an aircraft and achieving autonomous
navigation of UAVs. In this chapter, a fusion framework of visual and inertial sensors is presented
to provide reliable and robust attitude estimation during highly dynamic maneuvers.
A vision sensor has enabled natural scenes to act as an alternative source for attitude inference
in addition to other sensors such as IMU, GPS, magnetometer and thermopile. The merit of a
camera is that it is not only self-contained and passive, but also interactive with its environment.
Detection of a reliable reference of gravity in an image can recover absolute camera attitude with
a relatively constant level of accuracy. Horizon detection, for example, has been a prevalent
method for vision-based attitude estimation for mid-altitude missions. Figure 3.1(a) shows that
the horizon, equal to a straight line separating the sky and the ground regions, expresses the
attitude directly. However, this method is no longer applicable at low altitudes since the horizon
loses visibility due to occlusion by the surroundings, as shown in Figure 3.1(b)-(c).
Fortunately, another visual method can provide a way to infer attitude in images that por-
tray urban environments. Since man-made environments generally exhibit strong regularity in
structure, a camera will observe a number of line segments which are either parallel or orthogonal
to the gravity direction, as in Figure 3.2. These line segments in an image allow us to construct a
set of vanishing points at which each bundle of parallel lines intersects, and to describe the attitude
using an explicit geometric relationship between the vanishing points.
Using this advantage of urban scene regularity, we propose a driftless attitude estimator
that combines inertial gyro data with gravity-related line measurements in a Kalman filter; the
gyroscope traces instantaneous relative rotation at a high rate and the line measurements reset
its inevitable unbounded error accumulation. Note that the line segments become ready for
attitude estimation in the Kalman filter once they are correctly classified into either vertical
or horizontal edges according to their vanishing points.
Figure 3.1: Visual features available in urban scenes for attitude estimation at different altitudes: at a
low altitude, the horizon separating the sky and ground is occluded, but gravity-related line segments in
man-made structures can be exploited for attitude estimation.

The scene of interest is not limited to a close view of urban structures such as the facade
of a building or an indoor scene. We also consider a distant view of dense structures like that in
Figure 3.1(b), which is a more likely setting for fixed-wing UAV operations. These distant images
are more challenging because an estimation method must interpret a sparse set of short and noisy
line segments and reject outliers effectively.
In the experiment section, our sensor fusion method is compared with other conventional
inertial-only and vision-only attitude estimation methods.

3.1 Related Work


Attitude estimation for small UAVs is typically a study of fusion methods that use complementary
low-cost sensors. Vast research has been conducted regarding fusion of IMU, GPS and magne-
tometer in a Kalman filter framework aimed at a full state of the vehicle including position and
orientation. Table 3.1 summarizes the pros and cons of sensor fusion methods when gyroscopes
are combined with other complementary sensors.
When only an IMU was available, a nonlinear complementary filter [4] and a direction cosine
matrix [5] were developed for long-term driftless attitude estimation. After dynamic acceleration
was removed from accelerometer readings using simplified vehicle dynamic models, the gravitational
direction was combined with integrated gyro angles. However, these simple models could not capture
aggressive maneuvers precisely. An infrared thermopile was used to detect the horizon from the
temperature difference between the sky and ground, but its accuracy and resolution were limited [6].
In vision-only methods, when the horizon is visible with no occlusion, the attitude has been
derived from a straight line or a curved line in catadioptric images by separating the sky and
ground regions based on context differences [7, 8, 9, 10, 11]. Parallel line segments present in an
urban scene have been used for attitude estimation via vanishing point detection [12, 13]. A batch
process was developed for recovering the history of camera orientations using a bundle adjustment
of vanishing points [14].

Table 3.1: Comparison between sensor fusion methods that combine gyroscopes with other complementary
sensors for driftless attitude estimation

Fusion with gyro                   Reference                    Pros (+) and Cons (−)
GPS (carrier phase,                Gebre-Egziabher [17],        + Absolute orientation measure
  antenna array)                   Li et al. [18]               − More noisy in shorter baseline
                                                                − Fixed integer ambiguity
Accelerometer (static gravity)     Easton et al. [4],           + Compact and cost effective
                                   Premerlani et al. [5]        − Big short-term error
                                                                − Dynamic gravity elimination
Magnetometer (Earth                Metni et al. [19],           + Compact and cost effective
  magnetic field)                  Eldredge [20]                − Unknown angle around magnetic vector
                                                                − Magnetic interference
Camera (horizon)                   Ettinger et al. [7],         + Absolute attitude measure
                                   Cornall et al. [9]           − Occlusion at low altitude
                                                                − Relatively intensive computation
Camera (gravity-related            Demonceaux et al. [15],      + Urban scene at low altitude
  line segments)                   Hwangbo et al. [21]          + Absolute attitude measure
                                                                − Line classification needed
                                                                − Intensive computation

Mutually orthogonal vanishing points were searched for in a building scene, a vertical vanishing
point was found by sky detection [15], and it was possible to track these points over a catadioptric
image stream [16].
Our approach is also based on vanishing point detection but raw line measurements are directly
used in a Kalman filter without explicit use of detected vanishing points. In the fusion with the
gyroscope, a current best attitude estimate is repeatedly updated by each line measurement.

3.2 Attitude from Parallel Lines


In the perspective projection of a pinhole camera, parallel lines in a 3D scene merge at a single
point in the image called a vanishing point (VP). Multiple vanishing points may exist from vertical
and horizontal edges, as shown in Figure 3.2. When the camera calibration matrix K is known,
a geometric relationship between camera orientation, the horizon, and vanishing points along the
principal world coordinate axes has been well established in a Gaussian sphere representation [22].
One merit of the Gaussian sphere is that it is possible to describe all vanishing points, including
VPs at infinity, in homogeneous coordinates with no exceptional cases.

Suppose the camera is calibrated and line segments are expressed in normalized image coordinates.
In Figure 3.3, the Gaussian sphere G is a unit sphere located at the camera's optical
center O_c. In the 2D projective space P², a line segment ℓ_i ∈ R^{3×1} on the image is represented
as the normal vector of its great circle C_i in homogeneous coordinates. The intersection of two
parallel line segments, ℓ_i and ℓ_j, constructs a vanishing point, v_ij = ℓ_i × ℓ_j, located on the sphere.
Based on the duality in P² between points and lines, v_ij can be viewed as the direction to the
corresponding vanishing point at infinity. The vanishing points that lie on the same plane in a
scene define a vanishing line in the image.
In a view of man-made structures, we can arguably claim that all vertical edges merge at a
single vanishing point v_v, which we call a vertical vanishing point. On the other hand, horizontal
edges may converge at multiple vanishing points, which we call horizontal vanishing points V_h =
(v_h^1, ..., v_h^m). The vertical VP is aligned with the gravity direction z_w in the world coordinates,
and the horizontal VPs lie on the world ground plane W whose normal is the gravity direction.
Hence, the vanishing points v_v and V_h are subject to the following geometric constraint:

    v_v^\top v_h^i = 0,   i = 1, ..., m.   (3.1)

Note that the horizontal VPs in V_h are not necessarily orthogonal to each other unless the scene
strictly follows the Manhattan-world assumption, as in Figure 3.2. In real-world aerial scenes,
no constraints between horizontal VPs are imposed in this paper.
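To make the construction concrete, the short Python sketch below (a minimal illustration with hypothetical helper names, not part of the thesis software) builds one vertical and one horizontal VP from pairs of line segments given in normalized homogeneous image coordinates and then evaluates the constraint (3.1). The endpoints are chosen so that each pair of edges genuinely shares a vanishing point consistent with a single camera attitude.

import numpy as np

def line_through(p1, p2):
    # Homogeneous line through two normalized image points (the normal of its great circle).
    l = np.cross(np.append(p1, 1.0), np.append(p2, 1.0))
    return l / np.linalg.norm(l)

def vanishing_point(l_i, l_j):
    # Intersection of two image lines; for projections of parallel 3D lines this is their VP.
    v = np.cross(l_i, l_j)
    return v / np.linalg.norm(v)

# Two vertical edges that pass through the VP (0.1, 2.0), and two horizontal edges that pass
# through (1.0, -0.55), a point lying on the horizon dual to that vertical VP.
l1 = line_through([0.1, 2.0], [-0.20, -0.10])
l2 = line_through([0.1, 2.0], [ 0.30,  0.00])
l3 = line_through([1.0, -0.55], [-0.40, -0.30])
l4 = line_through([1.0, -0.55], [-0.40, -0.70])

v_v = vanishing_point(l1, l2)   # vertical VP
v_h = vanishing_point(l3, l4)   # one horizontal VP
print("constraint (3.1):", float(v_v @ v_h))   # ~ 0 up to numerical error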

Figure 3.2: Horizon and vanishing points in a Manhattan world: several bundles of parallel lines converge
at either a single vertical VP (v_v) or multiple horizontal VPs (v_h^1, v_h^2) on the image. The horizon h,
equivalent to the camera attitude, is derived from either v_v or a vanishing line connecting v_h^1 and v_h^2.

Figure 3.3: Representation of vanishing points on a Gaussian sphere G: (a) parallel lines ℓ_i and ℓ_j
intersect at a point v_ij; (b) camera attitude is equivalent to the vertical VP (v_v, antipodal point), or to
its great circle W on which the multiple horizontal VPs (v_h^1, ..., v_h^k) lie.

The horizon h is a vanishing line connecting any two horizontal VPs as shown in Figure 3.2.
From the constraint (3.1), it can also be seen that the line h is dual to the point vv in P 2 .

h = vhi × vhj , vv = vhi × vhj , (3.2)

This duality implies that the horizon h is a projection of the world ground plane W and vv is a
projection of its plane normal zw in Figure 3.3.

The camera attitude is equivalent to either the horizon h or the vertical VP vv , because an
orientation of the world ground plane with respect to the camera is equal to that of the great circle
W in the Gaussian sphere. Given a relative camera pose to the airplane, an airplane attitude can
be uniquely determined when either v_v or at least two v_h^i's are found in an image.

3.2.1 Vertical vanishing point

Suppose that a camera frame c is attached to an airplane frame b so that the camera's principal
axis is aligned along the fuselage centerline, as in Figure 3.4. The orientation of the airplane is given
by a roll φ, pitch θ, and yaw angle ψ. Then, by definition, the perspective projection of the world
z-axis z_w expressed in the camera coordinates with a normalized camera matrix P = [I 0] is equal to v_v:
 
" # −sφ cθ
zcw
= Rcb Rbw zw
 
vv = P w = −cφ cθ 
  (3.3)
0
−sθ
14 CHAPTER 3. VISUAL/INERTIAL ATTITUDE ESTIMATION

Rbc
zb
xb zw
zc
Oc
Ob
xc

yw xw
Rw
b (Á; μ; Ã)

Ow

Figure 3.4: Coordinate transformation between the world frame to the camera frame.

where    
cθ cψ −cφ sψ +sφ sθ cψ sφ sψ +cφ sθ cψ 0 −1 0
Rw c
   
b = cθ sψ , =
 cφ cψ +sφ sθ sψ −sφ cψ +cφ sθ sψ  R 0 0 −1 . (3.4)
 b  
−sθ sφ cθ cφ cθ 1 0 0
Note that camera translation has no effect on the projection of a point at infinity such as a
vanishing point.
Let v^* = (v_x, v_y)^\top denote a vanishing point v on an image, described in inhomogeneous
coordinates. As expected, the position of v_v^* is determined solely by the attitude (φ, θ),
regardless of the yaw angle ψ:

    v_v^* = \left( \frac{x_w^c}{z_w^c}, \; \frac{y_w^c}{z_w^c} \right)^{\!\top}
          = \left( \frac{s_\phi c_\theta}{s_\theta}, \; \frac{c_\phi c_\theta}{s_\theta} \right)^{\!\top}.     (3.5)

In the opposite direction, the attitude can be derived from v_v^* as follows:

    \phi = \mathrm{atan2}(v_x, v_y), \qquad
    \theta = \tan^{-1}\!\frac{1}{\sqrt{v_x^2 + v_y^2}}.                                                        (3.6)

The horizon h on the image, dual to v_v, is the line described by

    (s_\phi c_\theta)\, x + (c_\phi c_\theta)\, y + s_\theta = 0.                                              (3.7)
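The mapping between the attitude and the vertical VP can be checked numerically with the following minimal Python sketch (illustrative names only, not the thesis code): it evaluates (3.5) for a given roll and pitch and recovers the same angles through (3.6).

import numpy as np

def vertical_vp_from_attitude(phi, theta):
    # Inhomogeneous vertical VP, eq. (3.5).
    return np.array([np.sin(phi) * np.cos(theta) / np.sin(theta),
                     np.cos(phi) * np.cos(theta) / np.sin(theta)])

def attitude_from_vertical_vp(v):
    # Roll and pitch recovered from the vertical VP, eq. (3.6).
    vx, vy = v
    phi = np.arctan2(vx, vy)
    theta = np.arctan(1.0 / np.hypot(vx, vy))
    return phi, theta

phi, theta = np.deg2rad(15.0), np.deg2rad(8.0)        # example attitude
v_star = vertical_vp_from_attitude(phi, theta)
print(np.rad2deg(attitude_from_vertical_vp(v_star)))  # [15.  8.]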

3.2.2 Horizontal vanishing point

A horizontal VP is a point at infinity on the world ground plane W. Without loss of generality,
we can assume v_h is in the direction of the world x-axis. Then the projection of x_w is

    v_h = P \begin{bmatrix} x_w^c \\ 0 \end{bmatrix}
        = R_b^c R_w^b x_w
        = \begin{bmatrix} c_\phi s_\psi - s_\phi s_\theta c_\psi \\ -(s_\phi s_\psi + c_\phi s_\theta c_\psi) \\ c_\theta c_\psi \end{bmatrix}.                        (3.8)
In inhomogeneous image coordinates,

    v_h^* = \left( \frac{c_\phi s_\psi - s_\phi s_\theta c_\psi}{c_\theta c_\psi}, \;
                   \frac{-s_\phi s_\psi - c_\phi s_\theta c_\psi}{c_\theta c_\psi} \right)^{\!\top}.           (3.9)

All the horizontal VPs (v_h^1, ..., v_h^m) in V_h lie on the horizon h, and their locations are indicated
by the yaw angles (ψ_1, ..., ψ_m), respectively, as shown in Figure 3.3(b). In other words, the locus of
v_h along ψ ∈ [−π, π] is equal to h.

Figure 3.5: Sensor fusion diagram of integrated gyro angle and image line measurements in an extended
Kalman filter framework for driftless attitude estimation.

3.3 Visual/Inertial Sensor Fusion

An extended Kalman filter (EKF) framework [23] is employed to combine inertial and visual
sensors for driftless attitude estimation in realtime. Figure 3.5 shows a diagram of the sensor
fusion in which a high-rate attitude prediction from a gyroscope is updated by low-rate line
measurements from a camera.
At the Kalman update, we choose to use line segments rather than other higher-level measurements
such as detected vanishing points or an image-based attitude recovered from VPs. The
first reason is that the noise characteristics of line measurements are closer to Gaussian than those
of other measurements, and thus more suitable for modeling the noise variance in the EKF. Edges
are lower-level and less-processed features of an image. Another reason is that vanishing points
are not needed if all line measurements are known to be vertical. Vertical edges are immediately
ready for the Kalman update. In the situation where vertical edges are dominant, no further
steps to estimate their VP are necessary.

Figure 3.6: Observation model of a line segment for a desired vanishing point v̂*, g(·) = β. The angle β
is obtained from a virtual line connecting v̂* and the midpoint m.

3.3.1 Process model

Let x denote the attitude (φ, θ) and b denote a gyroscope bias. The total state in the EKF is χ =
[x, b]^\top = [φ, θ, b_x, b_y, b_z]^\top. The process model is composed of the mechanization of a strap-down
tri-axial gyroscope and the propagation of a gyroscope bias error model as follows:

    \dot{x} = \begin{bmatrix} 1 & \sin\phi \tan\theta & \cos\phi \tan\theta \\ 0 & \cos\phi & -\sin\phi \end{bmatrix} (\omega_g - b) + n_x                              (3.10)
    \dot{b} = n_b                                                                                              (3.11)

where ω_g is the gyroscope output, and n_x and n_b are Gaussian process noise and gyro bias error noise,
respectively. Errors in the gyroscope’s scale and bias are modeled as a single time-varying bias
term b in (3.11) and the Coriolis effect is neglected in (3.10). The Jacobian F in the state
transition is derived in Appendix-A.1.
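A minimal sketch of the corresponding EKF prediction step is given below, assuming a simple Euler discretization over the gyro sampling interval dt and a numerical Jacobian in place of the analytic F of Appendix A.1; the state layout χ = [φ, θ, b_x, b_y, b_z] follows the text, but the function and variable names are illustrative rather than the thesis implementation.

import numpy as np

def euler_rate_matrix(phi, theta):
    # Maps body rates to Euler-angle rates, the 2x3 matrix of eq. (3.10).
    s, c, t = np.sin, np.cos, np.tan
    return np.array([[1.0, s(phi) * t(theta), c(phi) * t(theta)],
                     [0.0, c(phi),           -s(phi)]])

def predict(chi, P, omega_g, dt, Q):
    # One EKF prediction with the attitude kinematics (3.10) and the bias random walk (3.11).
    def f(state):
        out = state.copy()
        out[0:2] += euler_rate_matrix(state[0], state[1]) @ (omega_g - state[2:5]) * dt
        return out                      # bias mean stays constant, eq. (3.11)
    chi_new = f(chi)
    # Numerical Jacobian F = df/dchi (the thesis derives it analytically in Appendix A.1).
    eps = 1e-6
    F = np.column_stack([(f(chi + eps * e) - chi_new) / eps for e in np.eye(5)])
    P_new = F @ P @ F.T + Q * dt
    return chi_new, P_new

# Example: small roll rate with a small z-gyro bias.
chi = np.array([0.0, 0.05, 0.0, 0.0, 0.01])
P = np.eye(5) * 1e-3
Q = np.diag([1e-4, 1e-4, 1e-8, 1e-8, 1e-8])
chi, P = predict(chi, P, omega_g=np.array([0.1, 0.0, 0.0]), dt=0.01, Q=Q)
print(chi[:2])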

3.3.2 Line measurement update

Suppose that line segments are already classified according to vanishing points. From the classification
step that will be described in Section 3.4, a set of line segments L = (ℓ_1, ..., ℓ_n) is grouped
into (m + 1) line groups with corresponding vanishing points H = (v_v, v_h^1, ..., v_h^m), which we call
a horizon measurement. Each line segment ℓ_i ∈ L is now represented as

    ℓ_i = (α_i, m_i, s_i; v ∈ H)   for i = 1, ..., n                                                           (3.12)

where the parameters are its midpoint m_i, slope α_i, length s_i and associated VP v in H.

A line observation model g(·) captures how a line segment ℓ_i should be aligned when a desired
vanishing point v̂* is given on an image, as in Figure 3.6. This model is represented as the angle β_i
of a virtual line connecting v̂* and the midpoint m_i. The way to set a desired vanishing point v̂*
for the line ℓ_i differs depending on whether its associated line group is vertical or horizontal.
If the line is a vertical edge, v̂* is a function of the attitude x only. If the line is a horizontal
edge, v̂* is a function of the attitude x and its associated vanishing point v_h^k in H.

    \beta_i = \tan^{-1}\!\left( \frac{\hat{v}_y^* - m_y}{\hat{v}_x^* - m_x} \right)
            = \begin{cases}
                g_v(x, m_i)        & \text{if } \ell_i \text{ is vertical},   \; \hat{v}^* = v_v^*(x) \text{ in (3.5)} \\
                g_h(x, v_h^k, m_i) & \text{if } \ell_i \text{ is horizontal}, \; \hat{v}^* = v_h^*(x, \psi_k) \text{ in (3.9)}
              \end{cases}                                                                                      (3.13)

Since v_h^k is generally not on the ground plane W constructed by the attitude x in the Gaussian
sphere, the desired v̂* for a horizontal edge is obtained from the projection of v_h^k onto W, as
seen in Figure 3.7(a). In other words, v̂* on the image plane is the intersection point of the
following two lines: the horizon h equivalent to the attitude x, and the line connecting v_h^k and
the camera center in Figure 3.7(b). Then, ψ_k is computed as

    \psi_k = \tan^{-1}\!\left( \cos\theta \sqrt{ \frac{v_1^2 + v_2^2}{v_3^2} - \tan^2\theta } \right)           (3.14)

when v_h^k = (v_1, v_2, v_3) and x = (φ, θ).

The measurement of a line slope α_i is assumed to be perturbed by a Gaussian noise n_ℓ:

    α_i = β_i + n_ℓ   for i = 1, ..., n.                                                                       (3.15)

The variance of n_ℓ is set to be inversely proportional to the line length, n_ℓ = λ s_i^{-1}, since the
angle of a longer line segment is less likely to be perturbed in an image processing step.
The Jacobian J in the EKF update is the partial derivative of β_i with respect to x and b. Since
the bias b is not involved in the update, two different Jacobians with respect to x are obtained
for vertical and horizontal line measurements, respectively, as follows:

    J_v = \frac{\partial g_v}{\partial x}
        = \left[ \; \frac{-r^2 + r n_y}{d^2}, \;\; \frac{1}{c_\theta^2}\,\frac{r^2 n_x}{d^2} \; \right],       (3.16)

    J_h = \frac{\partial g_h}{\partial x}
        = \left[ \; \frac{-r^2 + (t_\psi / c_\theta)\, n_x - t_\theta n_y}{d^2}, \;\;
                    \frac{n_x + t_\psi(-c_\theta + s_\theta n_y)}{(c_\theta d)^2} \; \right],                  (3.17)

where r, d, n_x and n_y are defined in Figure 3.6. See Appendix A.2 for the derivations of J_v and
J_h in detail.
Jh in details.
The innovation in the Kalman update is a deviation angle e. For all the line segments in
L, the measurement update is sequentially repeated with a different noise variance n_ℓ, which is
inversely weighted by the line length s.
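A compact sketch of this sequential update is shown below for the vertical-edge case only, using (3.5) and (3.13); the horizontal case would substitute the projected VP of (3.9) and (3.14), and a numerical Jacobian stands in for (3.16). All names are illustrative assumptions, not the thesis code.

import numpy as np

def vertical_vp(phi, theta):
    # Desired vertical VP from the current attitude estimate, eq. (3.5).
    return np.array([np.sin(phi) * np.cos(theta) / np.sin(theta),
                     np.cos(phi) * np.cos(theta) / np.sin(theta)])

def predicted_angle(x, m):
    # Observation model g_v: angle of the virtual line from the midpoint m to the VP, eq. (3.13).
    v = vertical_vp(x[0], x[1])
    return np.arctan2(v[1] - m[1], v[0] - m[0])

def update_with_lines(chi, P, lines, lam=1e-3, eps=1e-6):
    # Sequential EKF update with classified vertical line segments (alpha, midpoint, length).
    for alpha, m, s in lines:
        beta = predicted_angle(chi[:2], m)
        H = np.zeros((1, 5))                       # measurement Jacobian; bias columns stay zero
        for k in range(2):
            d = np.zeros(2); d[k] = eps
            H[0, k] = (predicted_angle(chi[:2] + d, m) - beta) / eps
        R = lam / s                                # noise variance inversely weighted by length, eq. (3.15)
        e = np.arctan2(np.sin(alpha - beta), np.cos(alpha - beta))   # wrapped innovation
        S = H @ P @ H.T + R
        K = (P @ H.T) / S
        chi = chi + (K * e).ravel()
        P = (np.eye(5) - K @ H) @ P
    return chi, P

# Example: three noisy vertical edges given as (angle [rad], midpoint, length).
chi = np.array([0.05, 0.10, 0.0, 0.0, 0.0])
P = np.eye(5) * 1e-2
lines = [(1.52, np.array([0.1, 0.0]), 40.0),
         (1.49, np.array([-0.2, 0.1]), 25.0),
         (1.55, np.array([0.3, -0.1]), 60.0)]
chi, P = update_with_lines(chi, P, lines)
print(chi[:2])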

Figure 3.7: Determination of a desired vanishing point v̂* in the EKF update for horizontal line measurements
associated with v_h^k: (a) on the Gaussian sphere; (b) on the image plane. The v̂ (v̂* on the image) is a
projection of v_h^k onto the world ground plane W (h on the image), which is equal to the current attitude
estimate x.

3.4 Line Classification


This section describes how to detect vanishing points and classify line segments into vertical or
horizontal edges. The classified line measurements will proceed to the EKF update in Section 3.3.
We present a hierarchical clustering method which recursively splits the lines into vertical or
horizontal groups in a greedy manner. At each level of hierarchy, a RANSAC-based vanishing
point detection is performed for line grouping and outlier rejection with varying accuracy.
In the situation of continuous attitude estimation using an IMU and image stream, one main
feature of the proposed method is to use a current attitude estimate as the prior for a vertical
vanishing point in order to achieve fast and reliable clustering. Since natural urban images
may contain outliers that are neither aligned nor orthogonal to the gravitational direction of the
earth, this prior and the VP constraint (3.1) can effectively exclude invalid candidates for a true
vanishing point in the RANSAC-based method. Moreover, the hierarchical clustering splits the
lines based on this prior at the top level.

3.4.1 RANSAC-based vanishing point detection

Suppose n line segments L = (ℓ_1, ..., ℓ_n) are given in the presence of outliers. Two additional
inputs are provided for efficient VP hypothesis generation: a coarse estimate of the vertical
VP and the type of vanishing point we seek.
Algorithm 3.1 describes a hypothesis-and-test scheme in RANSAC [24] for the detection of
a single major vanishing point. Two lines (ℓ_i, ℓ_j) are randomly sampled to produce one VP
hypothesis, v_c = ℓ_i × ℓ_j. This candidate is tested against all the lines in L to count the number
of inliers. An inlier set with respect to v_c is determined by whether or not the deviation angle e
is less than an error threshold e_th (see Figure 3.6). After a preset number of trials, the best VP
is selected as the one that has the maximum number of inliers. Similar approaches can be found
in the literature [25, 26, 27].

Algorithm 3.1: RANSAC for detecting a major vanishing point

(v_best, I_best) = RANSAC_VP(L, e_th, v, vp_type)
    L: line segment data (α, m, s) in (3.12),  e_th: error threshold
    v: vertical VP prior,  vp_type: vertical or horizontal
begin
    Set N_trial, ε_max;  n_max = 0
    for trial = 1 to N_trial do
        (i, j) = random_pair();  v_c = ℓ_i × ℓ_j
        v_c ← v_c / ||v_c||
        if vp_type = vertical then ε = cos⁻¹(v_c^⊤ v)
        else ε = π/2 − cos⁻¹(v_c^⊤ v)
        if |ε| < ε_max then
            e = align_error(v_c, L)
            I = find(e < e_th)
            if |I| > n_max then
                n_max = |I|;  I_best = I;  v_best = v_c

(Note: I is a list of line indices and |I| is its length.)


For efficient hypothesis generation using the prior v, we discard implausible candidates whose
distance from v is greater than ε_max. The distance ε between v_c and v is defined differently
depending on the given VP type. For a vertical VP, the distance is the angle between v_c and v,
ε = cos⁻¹(v_c^⊤ v). For a horizontal VP, the distance is the angle between v_c and the plane normal
to v, ε = π/2 − cos⁻¹(v_c^⊤ v). This simple rejection rule significantly reduces invalid VP candidates
that are highly likely to be generated by outliers or by a combination of horizontal and vertical edges.
Note that v_c and v are normalized in homogeneous coordinates so that ‖v_c‖ = ‖v‖ = 1.
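A compact Python rendering of Algorithm 3.1 is sketched below as an illustration only; the fixed trial count is an assumption, and the alignment error is measured on the Gaussian sphere (arcsin of |ℓ·v_c| for unit vectors) as a simplification of the image-plane deviation angle e used in the thesis.

import numpy as np

def ransac_vp(lines, e_th, v_prior, vp_type, n_trials=200, eps_max=np.deg2rad(20)):
    # lines: (n, 3) array of unit line normals (great-circle representation on the Gaussian sphere).
    rng = np.random.default_rng(0)
    best_inliers, best_vp = np.array([], dtype=int), None
    for _ in range(n_trials):
        i, j = rng.choice(len(lines), size=2, replace=False)
        vc = np.cross(lines[i], lines[j])
        if np.linalg.norm(vc) < 1e-12:
            continue
        vc /= np.linalg.norm(vc)
        # Distance of the hypothesis from the vertical-VP prior, as in the text.
        ang = np.arccos(np.clip(abs(vc @ v_prior), 0.0, 1.0))
        eps = ang if vp_type == "vertical" else np.pi / 2 - ang
        if abs(eps) >= eps_max:
            continue                               # implausible candidate, rejected early
        # Alignment error of every line against the candidate VP.
        e = np.arcsin(np.clip(np.abs(lines @ vc), 0.0, 1.0))
        inliers = np.flatnonzero(e < e_th)
        if len(inliers) > len(best_inliers):
            best_inliers, best_vp = inliers, vc
    return best_vp, best_inliers

# Usage (assumed inputs): vp, idx = ransac_vp(L, e_th=np.deg2rad(2.0),
#                                             v_prior=np.array([0.0, 0.95, 0.31]), vp_type="vertical")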

3.4.2 Hierarchical line grouping

We use a hierarchical clustering method that partitions the lines L into vertical and horizontal
groups in a greedy manner based on the vertical VP prior (v). At the top level, this prior v makes
a coarse division of L into two non-overlapping groups: an uppermost vertical and a horizontal
line group in which corresponding VP types (vertical or horizontal) are labeled but their vanishing
points are not yet searched for. The accuracy of v is not critical for this division; an error of
up to 20 degrees from the true vertical VP on the Gaussian sphere is generally tolerated.
At lower levels, each group is divided into multiple subgroups with VP detection.

Algorithm 3.2: Hierarchical line clustering with multiple vanishing point detection

G = Hierarchical_Clustering(L, v)
    L: line segment data (m, α, s),  v: vertical VP prior
begin
    Set e_th, n_ℓ,min and α_max.  G = ∅, I = U(L), level = 0
    I_v = find(align_error(v, L) < e_th)
    G = add_line_group(∅, I_v, vertical, level)
    G = add_line_group(∅, I − I_v, horizontal, level)
    for level = 1 to L_max do
        foreach g in G_level do
            I ← I(g),  vp_type = vp_type(g)
            repeat
                (v_best, I_best) = RANSAC_VP(L(I), e_th / 2^level, v, vp_type)
                I_best = find(align_error(v_best, L) < e_th / 2^level)
                (I_1, I_2) = K_means_clustering(α, I_best, k = 2)
                if |mean(α(I_1)) − mean(α(I_2))| > α_max then
                    I_best = I_1
                v_best = SVD_weighted(L, I_best)
                G = add_line_group(v_best, I_best, vp_type, level)
                I ← (I − I_best)
            until |I| < n_ℓ,min

    foreach g in G do
        remove g from G if g has a child or |I(g)| < n_ℓ,min

    forall group pairs (g_i, g_k) in G do
        merge if cos⁻¹(v(g_i)^⊤ v(g_k)) < d_min

RANSAC in Algorithm 3.1 is repeated until remaining lines in the group are less than a threshold
n`,min . As a level moves further down, the inlier threshold eth and the minimum group size n`,min
become smaller in order to achieve a higher-resolution line grouping. The VP type of each group
is inherited from its parent group and is thus the same as that of its upper-most group. Therefore,
every subgroup tends to be composed exclusively of either vertical or horizontal edges depending
on the type of its uppermost group. The RANSAC also tries to find a vanishing point using v
and the given VP type.
Algorithm 3.2 shows the procedure that builds a hierarchy of line clusters; the splits are
performed recursively by the RANSAC while moving down the hierarchy. The following additional
steps take place after the RANSAC at each group division.


Figure 3.8: Effect of K-means clustering (k = 2) in Algorithm 3.2. VPs are marked by green. (a) Inlier
size in the RANSAC is maximized by including outliers (horizontal edges) when eth = 2.5 deg. (b) K-
means clustering with respect to α shows that the mean difference is larger than the maximum line angle
divergence αmax = 60 deg. (c) The smaller group (horizontal edges) in K-means clustering is removed
when αmax is violated.

• Group membership Ibest is recomputed for the entire line segment with respect to the vbest
returned by RANSAC. This allows groups to overlap and provides a chance to include the
members missed from its parent node.
• The VP representing the group (vbest ) is refined as a least squares solution of Av = 0, where
each row of A is Ai = si `i⊤ , i ∈ Ibest , in SVD weighted().

• As exemplified in Figure 3.8, K-means clustering with respect to line angle α copes with the
case in which the number of inliers is maximized by including both vertical and horizontal
edges. To remedy this case, we assume that a bundle of lines sharing a VP has a bounded
range of line angles (αmax ) when a camera's field-of-view is limited. Hence, the smaller
partition is split off and tested for a possible merge with other groups only if the mean
difference between the two partitions of line angle α is larger than αmax .

Once the hierarchical clustering is complete, we first choose all the bottom-level nodes that
have no children, and remove small-sized nodes among them since their VPs are supported by
insufficient line members. Because a vertical line group should exist uniquely, all the groups
labeled as vertical are merged and filtered by the K-means clustering. The horizontal line groups
are also merged if the distances between corresponding VPs are closer than a threshold dmin ,
i.e., cos−1 (vhi> vhj ) < dmin . Eventually, the lines are clustered into a single vertical and multiple
horizontal line groups, G = (gv , gh1 , . . . , ghm ).

Some tree diagram examples and their selected line groups are shown in Figures 3.9 and 3.10 for
simulated images and real aerial images, respectively.

3.5 Horizon Measurement


The vanishing points in G (vbest in each line group) that resulted from the line clustering do not
necessarily satisfy the VP constraint (3.1) even though this constraint has been used to reject
invalid VP candidates. Hence, we seek a set of vanishing points H = (vv , vh1 , . . . , vhm ) that are
strictly subject to vv> vhi = 0 for i = 1, . . . , m, and we call H a horizon measurement. On an
image plane, H is represented by a horizon line h. This h is equivalent to vv in H and is also
the line on which all the horizontal VPs (vh1 , . . . , vhm ) lie.
To find the Hbest that best describes the classified line groups G, we propose another RANSAC-
based hypothesis-and-test scheme that efficiently generates horizon measurement hypotheses from
G in a principled way. When a camera is the only available sensor, this scheme serves as a vision-
only estimation method that uses line measurements only and yields the best attitude estimate
from vv in Hbest .

3.5.1 Horizon hypothesis generation

Suppose a set of line groups G = (gv , gh1 , . . . , ghm ) are given by the line clustering method in
Section 3.4. There are following three ways to produce a horizon candidate hc through sampling
classified line segments in G:

• Two vertical edges: hc = `i × `j , (i, j) ∈ gv .

• One vertical edge and one horizontal vanishing point: hc = `i × vhk , i ∈ gv .

• Two horizontal vanishing points: hc = vhi × vhj .

The first case produces a vertical VP from the intersection of two vertical edges. The second
case arises from the fact that a vertical line `i is dual to a point on the horizon, and thus two
points, `i and vhk , lie on the horizon hc . The last case provides the capability to derive vv even
when no vertical line group exists in G (e.g. a camera is looking down toward the ground). Note
that a horizontal VP is also randomly sampled from two horizontal edges in the same group,
vhk = `i × `j , (i, j) ∈ ghk .
A horizon measurement candidate Hc is composed of a vertical VP that is equal to hc , and
horizontal VPs that are generated by the intersections between hc and line segments randomly
sampled from each horizontal line group, i.e.,

Hc = (vv , vh1 , . . . , vhm ) s.t. vv = hc , vhk = hc × `i , i ∈ random(ghk ), for k = 1, . . . , m (3.18)

which satisfies the VP constraint (3.1). The number of candidates for H quickly becomes intractable
due to combinatorial explosion when the three cases for hc and the random line sampling for vhk
are considered. To create a feasible number of hypotheses, we cap the number of samples of a
single line at nv,max and nh,max , and of line pairs at nvv,max and nhh,max , respectively.

In Algorithm 3.3, the horizontal VP candidates (vh1 , . . . , vhm ) in Hc are generated and tested
at the same time in the hypothesis test function for the computational efficiency described in the
following section.
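
For illustration, a condensed NumPy sketch of this hypothesis generation is given below; the group representation and sample budgets are assumptions, degenerate (parallel-line) samples are not filtered, and only the three sampling cases for hc are shown, since the per-group vh sampling of (3.18) happens inside the test step as noted above.

import numpy as np

def unit(x):
    return x / np.linalg.norm(x)

def horizon_candidates(g_v, g_h, n_pairs=10):
    # g_v: (Nv,3) vertical lines; g_h: list of (Nk,3) horizontal line groups
    cands = []
    # case 1: two vertical edges
    for _ in range(min(n_pairs, max(len(g_v) - 1, 0))):
        i, j = np.random.choice(len(g_v), 2, replace=False)
        cands.append(unit(np.cross(g_v[i], g_v[j])))
    # horizontal VP samples from edge pairs within each group
    vhs = []
    for gh in g_h:
        samples = []
        for _ in range(min(n_pairs, max(len(gh) - 1, 0))):
            i, j = np.random.choice(len(gh), 2, replace=False)
            samples.append(unit(np.cross(gh[i], gh[j])))
        vhs.append(samples)
    # case 2: one vertical edge x one horizontal VP
    for samples in vhs:
        for vh in samples:
            for l in g_v:
                cands.append(unit(np.cross(l, vh)))
    # case 3: two horizontal VPs from different groups
    for a in range(len(vhs)):
        for b in range(a + 1, len(vhs)):
            for v1 in vhs[a]:
                for v2 in vhs[b]:
                    cands.append(unit(np.cross(v1, v2)))
    return cands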

3.5.2 Horizon hypothesis test

A set of horizon measurement candidates H∗c is evaluated based on the following score function:
\[
\mathrm{Score}(H_c;\, G) = C_v + \sum_{k=1}^{m} C_h^k = \sum_{\ell_j \in g_v} f(v_v, \ell_j) + \sum_{k=1}^{m} \sum_{\ell_j \in g_h^k} f(v_h^k, \ell_j) \tag{3.19}
\]
\[
f(v_i, \ell_j) = \begin{cases} s_j \left( 1 - \dfrac{e(v_i, \ell_j)}{e_{th}} \right) & \text{if } e < e_{th} \\[4pt] 0 & \text{otherwise} \end{cases} \tag{3.20}
\]

where e is an alignment angle error of the line `j to vi and sj is a line length of `j as shown in
Figure 3.6. A similar function is used in Rother’s work [25]. The line length in the score reflects
that a true VP is more closely positioned in the direction of the longer line segment, and the
score is irrelevant to outliers whose alignment error is larger than eth .
The score function for each vi in Hc accounts only for the corresponding line group gi in
G, but ignores the other groups completely. This is because the line segments are assumed to
already be properly clustered into G and eth rejects outliers in the score, if any. Therefore, as
implemented in Algorithm 3.3, the maximum score per horizontal group (Ch,best^k) can be evaluated
simultaneously with the generation of the vh^k candidates in Hc .
Once all the hypotheses are tested, the hypothesis with the maximum score is selected as
Hbest which best explains both vertical and horizontal line measurements in terms of the score
function. In Figure 3.10, the top five horizon candidates are listed in each image.
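
A minimal NumPy rendering of this scoring (using the same simplified alignment-error proxy as in the earlier sketch; the data layout is an assumption) would look as follows.

import numpy as np

def vp_line_score(v, lines, lengths, e_th):
    # Length-weighted consensus of one VP with its own line group, cf. Eq. (3.20)
    l = lines / np.linalg.norm(lines, axis=1, keepdims=True)
    e = np.abs(np.pi / 2 - np.arccos(np.clip(l @ v, -1.0, 1.0)))
    return np.where(e < e_th, lengths * (1.0 - e / e_th), 0.0).sum()

def horizon_score(H, groups, e_th):
    # Total score of a horizon candidate H = (vv, vh1, ..., vhm), cf. Eq. (3.19);
    # groups is [(lines_v, lengths_v), (lines_h1, lengths_h1), ...] aligned with H
    return sum(vp_line_score(v, ln, s, e_th) for v, (ln, s) in zip(H, groups))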

Vision-only vs. visual-inertial method:


The procedure to find Hbest can be considered as a vision-only method, and vv in Hbest is an
optimal solution for the attitude estimation that uses line segments only.
In the visual-inertial method, Hbest and G are inputs to the Kalman update in Section 3.3.2.
Compared with vbest found in the line clustering in Section 3.4, Hbest provides a more precise yaw
angle ψk of each horizontal vanishing point vhk at the Kalman update in (3.14). From the benefits
of the sensor fusion, the advantages of the visual-inertial method include a higher estimation rate,
smaller estimation error, and the ability to cope with occasional textureless images. Performance
comparison between the vision-only and visual-inertial method will be presented in Section 3.7.

Figure 3.9: Hierarchical line clustering results on two simulated Manhattan-world images: (left) hierarchy
of line clusters, (right top) classified line groups G, (right bottom) Hbest on Gaussian sphere.

Figure 3.10: Hierarchical line clustering results on two aerial urban images: (left) hierarchy of line
clusters, (right top) classified line groups G and top-five horizon candidates H, (right bottom) Hbest on
Gaussian sphere.

Algorithm 3.3: Horizon hypothesis generation and test for vision-only attitude estimation
{h1c , . . . , hpc max } = Horizon Hypothesis Generation(G, L)
begin
Set nv,max , nvv,max , and nhh,max , p = 0;
foreach vertical line group gv in G do
for u = 1 to nvv,max do
hpc = `i × `j ← (i, j) = random pair(gv ); p = p + 1;

foreach horizontal line group gh in G do


for u = 1 to nhh,max do
vh = `i × `j ← (i, j) = random pair(gh );
foreach vertical line group gv in G do
for v = 1 to nv,max do
hpc = vh × `p ← p = random(gv ); p = p + 1;

foreach horizontal line group pair (gh1 and gh2 ) in G do


for u = 1 to nhh,max do
vh1 = `i × `j ← (i, j) = random pair(gh1 );
for v = 1 to nhh,max do
vh2 = `i × `j ← (i, j) = random pair(gh2 );
hpc = vh1 × vh2 , p = p + 1;

Hbest = Horizon_Hypothesis_Test({h1c , . . . , hc^pmax }, G, L)
begin
    nh = number of horizontal line groups in G
    for p = 1 to pmax do
        vv^p = hc^p
        Cv = Σi Score(vv^p , `i ∈ gv )
        for k = 1 to nh do
            Ch,best^k = 0
            for u = 1 to nh,max do
                vh = vv^p × `i ← i = random(gh^k )
                Ch^k = Σi Score(vh , `i ∈ gh^k )
                if Ch^k > Ch,best^k then
                    Ch,best^k = Ch^k
                    vh,best^k = vh
        Hp = (vv^p , vh,best^1 , . . . , vh,best^nh )
        Cp = Cv + Σk Ch,best^k
    Hbest = argmaxp (Cp )

3.6 Performance Evaluation in Graphical Simulator

The simulation evaluation was performed for the visual/inertial attitude estimation using a graph-
ical flight simulator. This evaluation is focused on the precision and accuracy of our estimation
method when an UAV continuously conducts highly dynamic maneuvers in an urban setting.
The airplane was driven manually by a computer-connected RC controller to generate ag-
gressive motions resulting in large swings of roll and pitch angles up to ±120 and ±60 degrees,
respectively. The output of a tri-axial gyroscope was sampled at 100Hz. A large gyroscope noise
(0.05 rad/s) was intentionally imposed in order to demonstrate the effectiveness of visual attitude
sensing in short time intervals (the noise of a MEMS-based gyroscope is typically 0.01 rad/s).
A vision-processing pipeline including edge detection and line segments fitting was identical to
that in a real experiment, except that rendered images were captured. Figure 3.11 shows some
samples of rendered images on which three line groups (gv , gh1 , gh2 ) and the true horizon h are
overlaid.

Estimation performance:

Figure 3.12 shows the estimation performance of the visual/inertial estimation method over
a 30-second flight. Note that, every 10 seconds, the EKF restarts with an initial 10-degree
attitude error and 0.1 rad/s gyroscope bias error. This is intended to investigate how fast the
visual Kalman update responds to an initial error. The noises in the EKF are set to
[nx⊤ nb⊤ ]⊤ = [0.01 0.01 0.02 0.02 0.02]⊤ rad/s, and n` = 10/si for a line measurement `i .
The initial attitude error nearly vanishes in less than one second, corresponding to about 30 Kalman
updates. The bias error takes about 3 to 5 seconds to reduce to nearly zero. Once the initial
error diminishes, estimation errors are always bounded within 3 degrees for both roll and pitch
angles and their variances are also less than 2 degrees on average. The higher estimate variance
in pitch than in roll demonstrates that the pitch is more sensitive to noises of line measurements
and vanishing point location. This is also explained by a high sensitivity of the Jacobian Jv in
(3.16) for a large pitch angle.
The average number of line segments per image is about 40. The sizes of the vertical and first
two horizontal line groups in Figure 3.12(f) show that, in most cases, H consists of both vv and
vh s from a result of the hierarchical line clustering.

Fully and partially observable measurements:

A set of line segments holds information about the attitude that can be abstracted from
their vanishing points. A horizon measurement H holds the following three different levels of
information about the attitude depending on its components:
• Redundant : Hv+h = (vv , vh1 , . . . , vhm )

Figure 3.11: (Simulated images) A Manhattan world is rendered in our graphical simulator. The ground-
truth horizon is a dotted black line. The lines associated with vv are in red and the other lines associated
with two vh s are in green and blue, respectively.

• Full : Hv = vv or Hh+h = (vh1 , vh2 )

• Partial : Hh = vh1
In the redundant case Hv+h , the attitude is overdetermined by multiple sources. For example,
the attitude can be derived from either vv or (vh1 , . . . , vhm ). In the full cases, Hv and Hh+h , the
attitude is uniquely determined. In the partial case Hh , a single horizontal VP is not sufficient to
determine the attitude. Nonetheless, Hh is still valid for the EKF update because the Kalman filter
enables the full state to be estimated even from a partial observation of the state.
In Table 3.2, attitude estimation performance is compared when partial, full or redundant
information of the horizon measurement H are respectively used in the Kalman update. The use
of line measurements is intentionally restricted to each case of H, and thus all the irrelevant line
measurements are discarded. For example, Hv uses only vertical line segments at the Kalman
update. The comparison shows that no significant difference exists between the redundant and
full cases in Table 3.2(c)-(e). In the partial case, Hh , where only the first vh is used in the Kalman
update, accuracy does not degrade profoundly, although the estimation variance increases due
to the smaller number of line measurements.

Table 3.2: Estimation performance comparison when partial, full or redundant information of the horizon
measurement H is respectively used at the Kalman update in the simulation (errors are in degrees)

Measurement φerr θerr Pφφ (◦ ) Pθθ (◦ )


(a) gyro only 27.28 23.71 – –
(b) gyro + Hh 2.92 2.54 2.61 4.48
(c) gyro + Hh+h 1.56 1.53 0.81 1.35
(d) gyro + Hv 1.10 0.96 0.75 1.07
(e) gyro + Hv+h 0.90 0.93 0.49 0.73

Pφφ and Pθθ denote the means of the EKF roll and pitch error variances, respectively.

Table 3.3: Computation time to produce classified vertical and horizontal line measurements and the
best horizon measurement Hbest in the real aerial experiment (tested on 320 × 240 images with a C
implementation on an Intel Core CPU at 1.73 GHz).

Time (msec)

Processing step Average Deviation Maximum

Canny edge detection 17.0 4.0 28.4


Recursive straight line fitting 1.7 0.7 5.5
Hierarchical line clustering 4.5 2.3 9.6
Horizon measurement 2.6 3.0 12.9
Total 25.8 7.5 56.4

[Figure 3.12 panels, top to bottom: (a) roll angle, (b) pitch angle, (c) gyroscope bias error, (d) attitude error, (e) attitude covariance, and (f) the number of lines in each group (vv , v1h , v2h ), plotted against time.]

Figure 3.12: Simulation evaluation of the visual-inertial attitude estimation (gyro + Hv+h ): a small
fixed-wing aircraft conducts highly dynamic maneuvers. Every 10 seconds, the Kalman filter restarts with
an initial 10-degree attitude error and 0.1 rad/s gyroscope bias error.

3.7 Experiment Results

Firstly, the Canny edge detector was applied to raw images, followed by non-maximal suppression
in order to obtain the edges of one-pixel thickness [28]. Prior to line fitting, these edge pixels were
rectified and normalized using a camera calibration matrix K and distortion factors. Finally,
each edge chain was recursively divided and fitted to line segments with sub-pixel accuracy based
on the thresholds of a line fitting error and minimum line length (we use one pixel error and 8%
of the image size, respectively).
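
A minimal NumPy sketch of the recursive split-and-fit step is shown below; the Canny edge map, non-maximal suppression, and edge-chain extraction are assumed to have been done beforehand (e.g. with OpenCV), and the chord-distance test is a simplification of the sub-pixel line fit, with the thresholds corresponding to the values quoted above.

import numpy as np

def split_and_fit(chain, err_th, min_len):
    # chain: ordered (N,2) edge pixels of one connected edge; returns line segments
    # as (endpoint_a, endpoint_b) pairs produced by recursive splitting at the point
    # of maximum deviation from the chord, keeping only segments longer than min_len.
    a, b = chain[0], chain[-1]
    d = b - a
    n = np.linalg.norm(d)
    if n < min_len:
        return []
    dist = np.abs(np.cross(d, chain - a)) / n     # perpendicular distances to the chord
    k = int(np.argmax(dist))
    if dist[k] <= err_th:
        return [(a, b)]
    return split_and_fit(chain[:k + 1], err_th, min_len) + \
           split_and_fit(chain[k:], err_th, min_len)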

3.7.1 Outdoor Urban Scene

An outdoor flight experiment was conducted over a football field surrounded by low-profile build-
ings and a stand. Some examples of our urban scene are shown in Figure 3.15. Building edges
and lane markers on the football field were detected as main line features. Both the camera and
IMU were calibrated in advance to obtain their intrinsic parameters and relative orientation. The
ground-truth attitude of the experiment was obtained by a manual line classification of inlier line
segments and finding a horizon that best explains these lines using (3.19). The boundary between
the sky and ground and the perspective view of buildings enabled us to draw an implied true
horizon line.
Figure 3.14 shows the estimation performance on a 60-second real aerial scene when the UAV
circled around a football field and formed a figure eight. The average attitude errors were 2.63
and 2.24 degrees, and the maximum errors were limited to 8.56 and 5.39 degrees for the roll
and pitch angles, respectively. The attitude error variance was always below 1.5 degrees. The
gyroscope bias was transient at the beginning due to the initial attitude error, but became steady
after 20 seconds. Table 3.3 shows the average and worst computation times spent on the individual
steps from a raw image until the classified line measurements and the best horizon measurement were
produced. Almost half the time was spent on edge detection, and the average total time of 25.8
msec guaranteed realtime processing at the video frame rate.
The sizes of classified line groups are shown in Figure 3.14(f). The line segments in each
group were sorted by length, and the maximum size of each line group was set to 20, discarding
the shorter lines. On average, 50 line segments were found and the average line length was 35 pixels in
320×240 images. Compared with the simulation, the experiment had a much smaller number
of vertical edges, because the UAV flew high over the buildings and there were no tall buildings
nearby. When only horizontal line groups were available, we observed that the attitude error and
its variance increased since the Kalman update depends on the accuracy of the vanishing point
detection.
Figure 3.13 demonstrates that a single horizontal VP in the EKF is sufficient to restrain the attitude from the gyroscope's unbounded drift (gyro + Hh ).

Figure 3.13: Estimation performance comparison between partial, full and redundant information of the
horizon measurement H in the experiment (errors are in degrees).

attitude from gyroscope’s unbounded drift (gyro + Hh ). Additional line measurements in Hv+h
slightly reduce the attitude error and its variance. Figure 3.15 shows several select examples
of the experimental result. The classified line segments in the middle column are inputs for the
Kalman update. In the last column, the pole axis and great circle of a Gaussian sphere represents
the EKF attitude estimate prior to the update step, and the points represent VPs in H obtained
from the hierarchical line clustering. They are plotted together to illustrate the deviation between
line measurements and attitude estimate at the prediction step. Their discrepancy indicates an
innovation error in the EKF and corrects the gyroscope drift.

3.7.2 Comparison with inertial-only and vision-only methods

The vision-only method refers to the best horizon measurement Hbest in Section 3.5. The inertial-
only method tested here is a nonlinear complementary filter that uses the direction of an estimated
gravity ag , obtained by eliminating the centrifugal acceleration from the accelerometer output
aimu using the vehicle speed v and angular velocity ω [4, 5]. Briefly speaking, the long-term drift is
corrected by the following PI feedback of the attitude error:

\[
\mathbf{a}_g = \mathbf{a}_{imu} - \boldsymbol{\omega} \times [v\;\, 0\;\, 0]^\top \tag{3.21}
\]
\[
\mathbf{e} = R\mathbf{z} \times \mathbf{a}_g \tag{3.22}
\]
\[
\boldsymbol{\omega} = \boldsymbol{\omega}_{imu} + K_p\, \mathbf{e} + K_i \int \mathbf{e} \tag{3.23}
\]
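
For reference, one step of such a filter can be sketched as follows (Python/NumPy); the frame conventions, the first-order attitude propagation, and the gains Kp and Ki are assumptions for illustration rather than the exact filter of [4, 5].

import numpy as np

def complementary_filter_step(R, e_int, a_imu, w_imu, v, dt, Kp=1.0, Ki=0.1):
    # R: current body-to-world attitude estimate; v: vehicle forward speed
    a_g = a_imu - np.cross(w_imu, np.array([v, 0.0, 0.0]))   # remove centrifugal term, (3.21)
    a_g = a_g / np.linalg.norm(a_g)
    g_body = R.T @ np.array([0.0, 0.0, 1.0])                 # predicted gravity direction in body frame
    e = np.cross(g_body, a_g)                                # attitude error, (3.22)
    e_int = e_int + e * dt                                   # integral of the error
    w = w_imu + Kp * e + Ki * e_int                          # drift-corrected rate, (3.23)
    wx = np.array([[0.0, -w[2], w[1]],
                   [w[2], 0.0, -w[0]],
                   [-w[1], w[0], 0.0]])
    return R @ (np.eye(3) + wx * dt), e_int                  # first-order attitude propagation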

In Figures 3.16 and 3.17, the sensor fusion in the visual-inertial method provides better
estimation accuracy and a smaller maximum error than the single-sensor methods. The
vision-only method has error peaks when line measurements are erroneous and insufficient to infer
a correct horizon. The inertial-only method has relatively large errors when the airplane undergoes
highly dynamic motions, because the dynamic acceleration becomes dominant in the accelerometer
measurements and is difficult to remove correctly.

[Figure 3.14 panels, top to bottom: (a) roll angle, (b) pitch angle, (c) gyroscope bias, (d) attitude error, (e) attitude covariance, and (f) the number of lines per group (vv , v1h , v2h ), plotted against time over the flight.]

Figure 3.14: The performance of the visual-inertial attitude estimation method (gyro + Hv+h ) on the
urban aerial scene in the experiment. The ground-truth on attitudes is given by manual fit of the horizon
line on the images.

Figure 3.15: (Left) Aerial urban images over a football field and buildings. (Middle) Classified line
segments after outlier rejection. The vertical line group is in red and horizontal line groups are in green
or blue. (Right) The pole axis and great circle of the Gaussian sphere are the attitude estimate prior
to the Kalman update. The points are the vanishing points in H estimated during the hierarchical line
clustering.

Figure 3.16: Comparison of average and maximum attitude estimation errors between the vision-only,
inertial-only, and visual-inertial methods in the outdoor experiment.


Figure 3.17: Attitude error plots of the vision-only, inertial-only, and visual-inertial attitude estimation
methods in the outdoor experiment during 60 seconds.

These comparisons clearly demonstrate the value and power of sensor fusion. In low-cost
small UAVs, it is hard to expect any single sensor to be superior to the others in all
aspects of performance and operating conditions. For instance, obtaining high-precision gyroscope angles over
a long period of time requires enormous effort and time for sensor calibration and for the compensation
of various noise effects such as stochastic bias fluctuation and the earth's rotation. The sensor fusion we
propose makes it possible to effectively circumvent these laborious processes.
Chapter 4

Visual/Inertial Sensor Calibration

For the fusion of visual and inertial sensors, calibration is an indispensable step to integrate the
measurements in a common frame of reference. In this chapter, an easy-to-use camera/IMU cal-
ibration procedure is presented for in-field automatic calibration, which uses gravity and natural
landmarks as reference measurements.
Our interest is focused on self-calibration, in which no artificial calibration object
or user interaction is required. Moreover, rather than calibrating each sensor independently, we
examine a method to calibrate both sensors simultaneously. The calibration parameter consists
of IMU shape and bias (S, b), camera matrix K, and relative orientation Ric between the sensors.
Relative translation tic is excluded here, but can be estimated afterward, when necessary, once
the other parameters are found.
The calibration procedure we propose begins with the factorization-based calibration, which
recovers an IMU’s intrinsic shape from a set of constraints regarding motion magnitude. The fac-
torization method, which originated from shape-and-motion recovery in computer vision, exploits
the fact that IMU measurement is the product of intrinsic shape parameters (scale, alignment
and bias) and exerted motion (accelerations or angular velocities). We use the following two
natural references for self-calibration; not only are the magnitudes of the loads exerted on the IMU
known, but an abundant calibration dataset is also readily collectable.
Gravity is a natural feature that is commonly involved with accelerometers and cameras.
The magnitude of gravity is precisely known in a static condition. For accelerometer calibration,
placing an IMU at various arbitrary attitudes produces a large number of calibration datasets
from gravity loads in different directions. For a camera, the projection of the gravity direction is
equal to the vanishing point at which vertical edges in an urban image merge. Since the IMU
calibration recovers gravity directions in terms of IMU coordinates, a camera calibration matrix
is obtainable from the known Euclidean angles between vanishing points.
The feature track of a distant unknown scene point over images serves as a natural landmark


for both gyroscope and camera calibration. An infinite homography constructed from feature
tracks not only contains the information about the magnitude of angular velocity but also provides
constraints for a rotating camera calibration.
Compared with other previous methods, the main distinctive features of the proposed cam-
era/IMU calibration are as follows:
• A self-calibration procedure using natural references such as gravity, vertical edge, and
distant scene points in an urban environment
• No assumption on the configuration or number of an IMU’s inertial components

• A linear solution for a redundant IMU

• An iterative linear solution for a triad IMU with an initial guess only for bias

Mathematical details on the factorization-based IMU calibration are described from Section 4.2
to 4.4 and the collaborative camera/IMU calibration procedure is described in Section 4.6.

4.1 Related Work

Traditionally, calibration requires expertise, labor, time, and external equipment to provide true
motions for IMU calibration [29, 30, 31, 32, 33, 34] and known 3D feature points for camera/IMU
calibration [35, 36, 37, 38, 39]. For example, precise sensor orientation in a static condition
has been realized by a robotic manipulator [29] or a special mechanical platform [30]. IMU’s
navigational state (position, velocity and orientation) has also been compared with external high-
precision devices like an optical tracking system [33, 34] or a speed-controlled turning table [40].
One prevalent approach for the IMU is self-calibration [41, 42, 43, 44]. For instance, a collec-
tion of partially known motions constitutes sufficient constraints on a parameter search. From
highly abundant loads of known magnitude, a nonlinear least squares solution [42] and a sophis-
ticated iterative method for an enlarged convergence region [44] were proposed. Their drawbacks
are that the shape model is not scalable but is instead limited to a tri-axial configuration, and
an initial guess is also required for all of the parameters. As an alternative, a new perspective on
self-calibration was explored by the factorization method [45, 46, 47, 48, 49, 50]. It has inspired
force/torque sensor calibration [46, 47, 49] and object color modeling [48], wherever a bilinear
formulation of the measurement is available.
propose a comprehensive solution for the IMU calibration in arbitrary configurations.
For a camera/IMU system, a multi-step approach [36] was presented to calibrate the param-
eters in the following order: IMU’s intrinsic parameter using a pendulum, relative orientation
using vertical edges and gravity, and then relative translation using a turntable and a planar visual
target. Although this method is able to obtain the entire set of intrinsic and extrinsic parameters, it requires a significant amount of labor and precision equipment.

Table 4.1: Comparison between sensor calibration methods for inertial and visual sensors

Method                      Baseline alg.             Reference                           Parameters          Initial  SELF  COMP  CORR
Lang & Pinz [35]            Nonlinear optimization    Gravity, turntable                  S, b, K, Ric        Yes      No    Low   No
Lobo & Dias [36]            Two-step method           Vertical planar target, turntable   S, b, K, Ric, Tic   Yes      No    Low   No
Mirzaei & Roumeliotis [37]  Extended Kalman filter    Planar target                       b, Ric, Tic         Yes      No    High  Yes
Kelly & Sukhatme [38]       Unscented Kalman filter   Scene points (SFM)                  b, Ric, Tic         Yes      Yes   High  Yes
Our method [50]             Factorization             Gravity, vertical edges             S, b, K, Ric        No       Yes   Low   No

SELF (self-calibration), COMP (computational load), and CORR (parameter correlation).

On the other hand, a Kalman
filter-based approach has cast calibration as a state estimation problem [37, 38, 39]. A vision-
aided inertial navigation system had the augmented states for relative pose and IMU bias. An
extended Kalman filter (EKF) [37] or an unscented Kalman filter (UKF) [38] progressively esti-
mated the states from known 3D features. A gray-box system identification technique was also
used to estimate the parameters from the nonlinear optimization of prediction errors in EKF [39].
Common drawbacks of these KF-based methods are that a good initial parameter is essential,
known feature points are required, and no other intrinsic parameters are included but IMU bias.
See Table 4.1 for a more detailed comparison between camera/IMU calibration methods.

4.2 Factorization Method for IMU Self-Calibration

This section provides an overview of the factorization method that is based on the bilinear formu-
lation of IMU measurements. Since accelerometers and gyroscopes are treated identically as
inertial sensors here, motion refers to either the force (acceleration) or the angular velocity exerted on
the IMU; the terms motion and load are used interchangeably.

Affine measurement model:

Suppose that the IMU is an accelerometer (or gyroscope) cluster of n single-axis components
and its components are arbitrarily aligned to each other as shown in Fig. 4.1.
A measurement model of each component z is affine, z = af + b, i.e., linearly proportional

Figure 4.1: Calibration parameters of a camera/IMU system (K, S, b, Ric ): (a) A camera calibration
matrix K, an accelerometer or gyroscope unit (S, b) composed of n single-axis components in an arbitrary
configuration, and their relative orientation Ric . (b) An affine measurement model (z = ai f + bi ) of a
single-axis inertial sensor component with a scale factor ai = ksi k and bias bi .

to an external motion f and non-zero for a null motion. In 3D, the measurement zi of the i-th
component in the IMU is expressed as the sum of the projection of a motion f on the sensitivity
axis si and the non-zero bias bi :

zi = f > si + bi , i = 1, . . . , n (4.1)

where the scale factor ai = ksi k.

Bilinear Form:

Once m motions are applied on the n-component IMU, the measurements are collected into
a matrix Z ∈ Rm×n as follows:
 
\[
Z = \begin{bmatrix}
f_1^\top s_1 + b_1 & f_1^\top s_2 + b_2 & \cdots & f_1^\top s_n + b_n \\
f_2^\top s_1 + b_1 & f_2^\top s_2 + b_2 & \cdots & f_2^\top s_n + b_n \\
\vdots & \vdots & & \vdots \\
f_m^\top s_1 + b_1 & f_m^\top s_2 + b_2 & \cdots & f_m^\top s_n + b_n
\end{bmatrix} \tag{4.2}
\]

where zij is the j-th component’s output for the i-th motion fi (i = 1, . . . , m, j = 1, . . . , n).
The measurement matrix Z can be rewritten as a product of two matrices, Fb and Sb , which have
been called an augmented motion matrix and a shape matrix [45], respectively. (The subscript b
indicates that the bias b is explicitly included in the shape matrix and that the motion matrix is
augmented with an extra column of ones on the right.) Each measurement zij is the product of an
augmented motion vector [fi⊤ 1] and a shape parameter [sj⊤ bj ].
 
\[
Z = \begin{bmatrix} f_1^\top & 1 \\ f_2^\top & 1 \\ \vdots & \vdots \\ f_m^\top & 1 \end{bmatrix}
\begin{bmatrix} s_1 & s_2 & \cdots & s_n \\ b_1 & b_2 & \cdots & b_n \end{bmatrix}
= \begin{bmatrix} F & \mathbf{1} \end{bmatrix} \begin{bmatrix} S \\ b \end{bmatrix}
= F_b S_b \tag{4.3}
\]
If an external calibration device, as used in a traditional calibration method, captured all of the
motions (f1 , . . . , fm ) precisely, the calibration would reduce to a least squares solution,
Sb = Fb+ Z = (Fb⊤ Fb )−1 Fb⊤ Z when m ≥ 4.
In contrast, our factorization-based calibration aims at estimating Sb from partially known
motions that are readily obtainable without calibration devices.
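
The bilinear structure is easy to reproduce numerically; the short NumPy sketch below generates a synthetic measurement matrix Z = Fb Sb from gravity loads of unit magnitude (unit scale factors and biases, random axis directions), which is convenient for exercising the solutions developed in the following sections. The default sizes and noise level are illustrative.

import numpy as np

def synth_measurements(m=36, n=6, noise=0.0025, seed=0):
    # m static attitudes under gravity, n arbitrarily aligned single-axis components
    rng = np.random.default_rng(seed)
    F = rng.normal(size=(m, 3))
    F /= np.linalg.norm(F, axis=1, keepdims=True)        # unit-magnitude loads (gravity)
    S = rng.normal(size=(3, n))
    S /= np.linalg.norm(S, axis=0, keepdims=True)        # unit scale factors ||s_i|| = 1
    b = np.ones(n)                                       # unit biases
    Fb = np.hstack([F, np.ones((m, 1))])                 # augmented motion matrix [F 1]
    Sb = np.vstack([S, b])                               # shape matrix [S; b]
    Z = Fb @ Sb + noise * rng.normal(size=(m, n))        # Eq. (4.3) plus measurement noise
    return Z, F, S, b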

4.2.1 Matrix reconstruction and load constraints

When Z is given, constraints are needed to find a true motion and shape matrix since there exist
infinitely many factorizations that decompose Z into a product of two matrices.
The first constraint is a matrix rank condition on Z. Let p denote the dimension of a cali-
bration parameter per component in Sb (p = 4). Since Z is constructed from Fb ∈ Rm×p and
Sb ∈ Rp×n , the rank of Z should be at most p when min(m, n) ≥ p and Z is noiseless. This has
been called the proper rank constraint [45]. Singular value decomposition (SVD) [51] can factor Z
into two rank-p matrices, F̂b and Ŝb , as follows:
\[
Z = U \Sigma V^\top = (U\Sigma^{1/2})(\Sigma^{1/2} V^\top) = \hat{F}_b \hat{S}_b \tag{4.4}
\]

where Σ is a p × p diagonal matrix and U and V are m × p and n × p matrices with orthonormal
columns, respectively. An ambiguity remains because it is possible to insert any invertible matrix
A4×4 in between:

\[
Z = \hat{F}_b \hat{S}_b = (\hat{F}_b A_{4\times4})(A_{4\times4}^{-1} \hat{S}_b) = F_b S_b \tag{4.5}
\]

The second constraint is drawn from partially known calibrating motions (f1 , . . . , fm ). Suppose
that each piece of information about the motions is expressed as an inner product:

fi> fj = cij (4.6)

which represents the partial knowledge of a pair of motions or a single motion such as a motion
magnitude kfi k2 = cii and an orthogonal motion fi> fj = 0. We call a set of these motion
constraints load constraint set C. A procedure on how to find A4×4 from the constraint set C is
a key step in the factorization and will be explained in detail in Section 4.3.
Finally, once A4×4 is found from C, the calibration step is completed with the simultaneous
recovery of the shape parameters in Sb and the applied motion in Fb as follows.

\[
S_b = A_{4\times4}^{-1} \hat{S}_b, \qquad F_b = \hat{F}_b A_{4\times4} \tag{4.7}
\]

Figure 4.2: Gravity magnitude constraint Cτ∗=1 for the accelerometer self-calibration: an accelerometer
unit is oriented at different attitudes with respect to gravity in a static condition.

Figure 4.3: Angular speed constraint C ∗ for the gyroscope self-calibration: a camera affixed to a gyroscope
unit is randomly rotated in front of distant scene objects. The angular speed kωk is obtained from feature
correspondences between two consecutive images and a known time interval ∆T .

4.2.2 Natural reference features for self-calibration

One prevalent constraint type used for self-calibration is magnitude. For example, an accelerom-
eter is exposed to an identical load (gravity) whose magnitude is kfi k = g for i = 1, . . . , m, as
shown in Figure 4.2. The magnitude of an angular velocity kωk exerted on a gyroscope is readily
available from a speed-controlled turntable or a rotating camera tracking distant scene objects
as shown in Figure 4.3. The detailed procedure to obtain kωk from an uncalibrated camera will
be described in Section 4.6.2. We define the intra-relation constraint set C ∗ as follows.

Definition 1. C ∗ is an intra-relation load constraint set composed only of load magnitudes,
i.e., C ∗ = {cii = fi⊤ fi | i = 1, . . . , m, m ≥ 9}. Also, Cτ∗ refers to the case in which all the magnitudes in C ∗ are
equal to τ , i.e., cii = τ for all i.

From Definition 1, the gravity magnitude constraint can be simply denoted by Cτ∗=1 (a known
constant magnitude is scaled to 1 without loss of generality).

4.2.3 Redundant vs. triad IMU

The proper rank constraint demands that the component size (n) should be no less than the
shape parameter dimension (p = 4), i.e., n ≥ p. This constraint is satisfied by a redundant IMU

(n ≥ 4) but violated by a triad IMU (n = 3). The SVD in (4.4) fails to produce Ŝb ∈ Rp×n
when n = 3.
One possible workaround for a triad IMU is to reduce the parameter dimension to p = 3
by excluding the bias b from Sb and then to estimate the bias iteratively from the factorization error. An
iterative factorization solution for a triad IMU will be presented in Section 4.4.

4.3 Redundant IMU Case


The calibration problem is now equivalent to finding a non-singular matrix A4×4 in (4.5). For
a redundant IMU, a linear solution will be derived from a load constraint set C. Note that the
linear solution that follows in this section is valid for any combination of load constraints, not
confined to C ∗ . Interesting properties of the solution spaces for C ∗ and Cτ∗ are discussed
in Appendix B.

4.3.1 Shape recovery procedure from load constraints

Given a load constraint set C in (4.6), a set of linear constraints for A4×4 in (4.5) is constructed.
Let A4×4 be partitioned into the following block matrices:
\[
A_{4\times4} = \begin{bmatrix} A_{4\times3} & a_4 \end{bmatrix}, \qquad A_{4\times3} = \begin{bmatrix} A_{3\times3} \\ u^\top \end{bmatrix} \tag{4.8}
\]

Firstly, the last column a4 is immediately recoverable without using C. Since the last column
of Fb = [F 1] is constant, the pseudo-inverse of F̂b gives a least squares solution for a4 :
\[
a_4 = \hat{F}_b^{+}\, \mathbf{1} \tag{4.9}
\]

Secondly, the remaining columns A4×3 are associated with C. From (4.7), each true load
fi ∈ R3×1 is rewritten as

\[
f_i = A_{4\times3}^\top\, \hat{f}_{b,i} \quad \text{for } i = 1, \ldots, m \tag{4.10}
\]
where f̂b,i ∈ R4×1 is the i-th row of F̂b .
Suppose that k load constraints exist in C. Plugging (4.10) into each load constraint cij
yields one quadratic equation for A4×3 . The Kronecker product rewrites this as a linear equation
in terms of q ∈ R10×1 , which is a vector of lower triangular elements of a symmetric matrix
Q = A4×3 A4×3⊤ :
\[
c_{ij} = f_i^\top f_j = \hat{f}_{b,i}^\top \left(A_{4\times3} A_{4\times3}^\top\right) \hat{f}_{b,j} = \hat{f}_{b,i}^\top\, Q\, \hat{f}_{b,j} \tag{4.11}
\]
\[
= [\hat{f}_{b,i} \otimes \hat{f}_{b,j}]^\top\, \mathrm{Vec}(Q) = [\hat{f}_{b,i} \otimes \hat{f}_{b,j}]^\top\, T_{16\times10}\, q \tag{4.12}
\]

Figure 4.4: Three cases on the number of solutions for q when rank(L) = 9: Among four roots of f (α),
a true α should be real and make Q3×3 a positive definite. When no noise in Z, a single solution exists
(case 1). Multiple or no solutions for q may exist for noisy Z (case 2 and 3).

where T16×10 is a constant matrix that converts a vectorized Q into a minimal parameter q.
Stacking all the constraints in C produces a linear system equation L ∈ Rk×10 for q:

c = Lq (4.13)

where L = S[f̂b,i ⊗ f̂b,j ]⊤ T16×10 , c is a vector of the {cij } in C, and S is a row-stacking operator
over all of the constraints in C.
It is noteworthy that Q ∈ R4×4 is not full-rank but a rank-3 matrix because rank(Q) =
rank(A4×3 ) = 3. Rewriting Q with the block matrices in (4.8)
" # " #
A>3×3 A3×3 A>
3×3 u Q3×3 q13
Q= , (4.14)
u> A3×3 u> u q>13 q44

reveals that Q (equivalently q) is internally subject to the following two constraints, which we
call a q-constraint:

det(Q) = 0 (4.15a)
min{eig(Q3×3 )} > 0 (4.15b)
where the first equality constraint is due to the non-invertible rank-3 Q, i.e., q13⊤ Q3×3−1 q13 − q44 = 0,
and the second inequality constraint indicates that Q3×3 = A3×3⊤ A3×3 is positive definite by
construction, so its eigenvalues must all be positive.
When L in (4.13) is full-rank with a sufficient number of load constraints (rank(L) = 10 when
k ≥ 10), a least squares solution q is obtainable from the pseudo-inverse of L. When no noise is
present in Z, the following particular solution qp inherently satisfies the q-constraint as well.

q = qp s.t. qp = L+ c (4.16)

Table 4.2: List of solution schemes for the redundant IMU calibration according to rank(L)

rank(L)   Solution for q                                   Examples
< 9       q is indeterminate                               Lack of load constraints: |C| < |C|min = 9;
                                                           degenerate load configuration F ∈ DF (Figure B.2)
9         q = L+ c + αqn , qn ∈ N (L);                     Minimal constraints: |C| = 9;
          α determined from the q-constraint (4.15)        gravity magnitudes: C = Cτ∗=1 with |C ∗ | ≥ 9
10        q = L+ c;                                        Non-minimal mixed constraints: C − C ∗ ≠ ∅;
          rank(Q) = 3 enforced by SVD                      different magnitudes: C = C ∗ − Cτ∗=1

On the other hand, even when L is not full-rank, q is still solvable if rank(L) = 9. This is because
q ∈ R10×1 actually has 9 degrees of freedom when the rank q-constraint (4.15a) is considered.
The solution for q spans a one-dimensional null space N of L with an unknown scalar α:

q = qp + αqn s.t. qp = L+ c, qn ∈ N (L). (4.17)

The rank and positivity conditions in the q-constraint (4.15) are used to determine α. Plugging
(4.17) into the rank condition (4.15a) yields a fourth-order polynomial f (α) whose coefficients
are parameterized by qp and qn (a symbolic computation of det(Q(qp + αqn )) = 0 produces these
coefficients). Among up to four possible roots, the true α must be real-valued and make Q3×3
positive definite (4.15b). Figure 4.4 shows the three possible cases for the number of q solutions
depending on the location of α. Exactly one α meets the q-constraint when Z is noiseless.
Table 4.2 summarizes the solution schemes for q according to rank(L).
Once q is obtained, A4×3 = [A3×3⊤ u]⊤ in (4.8) is found via the Cholesky decomposition of the
symmetric positive-definite Q3×3 in (4.14):
\[
A_{3\times3} = \mathrm{chol}(Q_{3\times3}), \qquad u = A_{3\times3}^{-\top}\, q_{13} \tag{4.18}
\]

Finally, the reconstruction of true Sb and Fb is completed as (4.7) by the A4×4 found from
the load constraint set C. Note that the final reconstruction is fundamentally ambiguous up to a
three-dimensional rotation R since Z = Fb Sb = FS + b = (FR)(R> S) + b. This means that the
sensor’s internal coordinate system can be chosen freely and is already reflected in the solution
during the Cholesky decomposition of Q3×3 , which determines A3×3 up to a rotation matrix R.
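
The whole shape-recovery procedure fits in a short NumPy sketch, shown below; it follows Algorithm 4.1, with the quartic in α recovered numerically by sampling det(Q(qp + αqn)) instead of symbolically, and without the rank-3 re-enforcement of Q in the full-rank branch. Function and variable names are illustrative, not the exact implementation.

import numpy as np

TRI = [(r, c) for r in range(4) for c in range(r + 1)]      # lower-triangle index of Q (10 entries)

def q_to_Q(q):
    Q = np.zeros((4, 4))
    for k, (r, c) in enumerate(TRI):
        Q[r, c] = Q[c, r] = q[k]
    return Q

def constraint_row(fi, fj):
    # Row of L such that row @ q = fi^T Q fj for symmetric Q, cf. Eq. (4.12)
    M = np.outer(fi, fj)
    return np.array([M[r, c] + M[c, r] if r != c else M[r, r] for r, c in TRI])

def redundant_imu_calibration(Z, constraints):
    # constraints: list of (i, j, c_ij); gravity magnitudes are entries (i, i, 1.0)
    m = Z.shape[0]
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    Fb_hat = U[:, :4] * np.sqrt(s[:4])                      # rank-4 factorization, Eq. (4.4)
    Sb_hat = np.sqrt(s[:4])[:, None] * Vt[:4, :]
    L = np.array([constraint_row(Fb_hat[i], Fb_hat[j]) for i, j, _ in constraints])
    c = np.array([cij for _, _, cij in constraints])
    qp = np.linalg.lstsq(L, c, rcond=None)[0]
    if np.linalg.matrix_rank(L) < 10:                       # e.g. gravity-only constraints
        qn = np.linalg.svd(L)[2][-1]                        # one-dimensional null space of L
        a_s = np.linspace(-2.0, 2.0, 9)                     # det(Q(qp + a*qn)) is a quartic in a
        dets = [np.linalg.det(q_to_Q(qp + a * qn)) for a in a_s]
        Q = None
        for a in np.roots(np.polyfit(a_s, dets, 4)):
            if abs(a.imag) < 1e-6:
                Qc = q_to_Q(qp + a.real * qn)
                if np.all(np.linalg.eigvalsh(Qc[:3, :3]) > 0):   # positivity q-constraint (4.15b)
                    Q = Qc
    else:
        Q = q_to_Q(qp)
    A33 = np.linalg.cholesky(Q[:3, :3]).T                   # Q3x3 = A33^T A33
    u = np.linalg.solve(A33.T, Q[:3, 3])                    # u = A33^{-T} q13, Eq. (4.18)
    a4 = np.linalg.lstsq(Fb_hat, np.ones(m), rcond=None)[0] # a4 = Fb_hat^+ 1, Eq. (4.9)
    A44 = np.column_stack([np.vstack([A33, u]), a4])
    Sb = np.linalg.solve(A44, Sb_hat)                       # Eq. (4.7)
    Fb = Fb_hat @ A44
    return Sb[:3, :], Sb[3, :], Fb[:, :3]                   # (S, b, recovered loads F)

With a synthetic Z generated as in the earlier sketch and constraints = [(i, i, 1.0) for i in range(m)], the recovered S and b should match the simulated unit scale factors and biases up to the global rotation ambiguity discussed above.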

4.3.2 Linear solution space

Table 4.2 summarizes the solution schemes according to rank(L) and typical examples for each
scheme.

Algorithm 4.1: Factorization-based linear solution for redundant IMU calibration

Sb = (S, b) = Redundant_IMU_Calibration(Z, C)
begin
    [U, Σ, V] = svd(Z)
    F̂b = Up Σp^(1/2), Ŝb = Σp^(1/2) Vp⊤ , p = 4
    foreach cij in C do
        Lk = [f̂b,i ⊗ f̂b,j ]⊤ T16×10 as in (4.12)
    if C = Cτ∗=1 or |C| = 9 then
        [U, Σ, V] = svd(L)
        qp = L+ c, qn = last_col(V)
        α = q_constraint(qp , qn ) as in (4.15)
        Q = convert(qp + αqn )
    else
        q = L+ c, Q = convert(q)
        [U, Σ, V] = svd(Q)
        Q = U3 Σ3 V3⊤          (enforce rank(Q) = 3)
    A3×3 = chol(Q3×3 ), A4×4 = [ [A3×3 ; q13⊤ A3×3^(−1)]  F̂b+ 1 ]
    Sb = A4×4^(−1) Ŝb , Fb = F̂b A4×4

One key example is that L is always rank deficient when the gravity magnitude is solely used in
the accelerometer calibration, i.e., rank(L) ≤ 9 when C = Cτ∗=1 . For the gravity magnitude,
Propositions 1 and 2 in Section B.4 claim that f (α) has one real root for the true α and
that the others are triple roots that violate the positivity q-constraint (4.15b).

More analysis on the solution space of q is described in Appendix B, including the minimum
number of constraints |C|min and the degenerate load configuration DF .

4.4 Triad IMU Case

The factorization method allows the shape parameter dimension (p) up to a component size (n).
For a triad IMU (n = 3), the exclusion of the bias from Sb reduces the parameter dimension to
three in S. Then, the bias is iteratively estimated at the outside of the factorization loop using
motion reconstruction errors compared with a load constraint C. Note that the following iterative
algorithm is specific to C ∗ .

4.4.1 Iterative factorization algorithm

Let D represent bias-compensated measurements in which the bias is subtracted from Z, i.e.,
dij = zij − bj for all i, j in (4.2). Since D = FS, a bilinear formulation of D is a product of two
matrices, F and S, and rank(D) = 3. The constraints used for the factorization are identical to
those for a redundant IMU, except that we use D instead of Z.
Suppose that an initial bias b0 is given. At the k-th iteration, the proper rank constraint
(p = 3) on Dk yields

\[
D^k = Z - \mathbf{1}_{m\times1}\, b^k \tag{4.19}
\]
\[
= (\hat{F} A_{3\times3})(A_{3\times3}^{-1} \hat{S}) \tag{4.20}
\]

from svd(Dk ) = UΣV⊤ , F̂ = UΣ1/2 and Ŝ = Σ1/2 V⊤ . The true shape and motion matrices are
now indeterminate up to an invertible matrix A3×3 . From fi = A3×3⊤ f̂i , each load constraint in
C ∗ produces one linear equation for q′ :

\[
c_{ii} = \hat{f}_i^\top \left(A_{3\times3} A_{3\times3}^\top\right) \hat{f}_i = \hat{f}_i^\top\, Q_{3\times3}\, \hat{f}_i \tag{4.21}
\]
\[
= [\hat{f}_i \otimes \hat{f}_i]^\top\, T_{9\times6}\, q' \tag{4.22}
\]

where q′ ∈ R6×1 is a vector of the lower triangular elements of the symmetric matrix Q3×3 and T9×6
is a constant conversion matrix from a vectorized Q3×3 to q′ . If |C ∗ | ≥ 6, q′ is solvable from a
linear system equation:

L′ q′ = c (4.23)

where L′ = S[f̂i ⊗ f̂i ]⊤ T9×6 . Note that q′ is solvable only if L′ has full rank, because q′ has no
internal constraints like (4.15a). The factorization is completed via the Cholesky decomposition
of Q3×3 into A3×3 . The best reconstruction of Dk is given as Fk = F̂ A3×3 and Sk = A3×3−1 Ŝ.

Since Dk is incompletely compensated due to the bias error in bk , a new bias bk+1 for the
next iteration is given as

\[
f_i = \frac{\sqrt{c_{ii}}}{\lVert f_i^k \rVert}\, f_i^k, \quad i = 1, \ldots, m \tag{4.24}
\]
\[
S_b = \begin{bmatrix} F & \mathbf{1} \end{bmatrix}^{+} Z \tag{4.25}
\]
\[
b^{k+1} = \text{last-row}(S_b) \tag{4.26}
\]

and the iteration continues until kbk+1 − bk k < ε for a small threshold ε.
Since Dk is contaminated by the bias error in bk , the reconstructed motion and shape matrices,
Fk and Sk , are also incorrect. The least squares solution q′ in (4.23) has a residual that leaves
Fk not fully constrained by C ∗ .

Algorithm 4.2: Iterative factorization-based calibration method for a triad IMU

Sb = (S, b) = Triad_IMU_Calibration(Z, b0 , C ∗ )
k = 0
repeat
    [U, Σ, V] = svd(Z − 1 bk )
    F̂ = Up Σp^(1/2), Ŝ = Σp^(1/2) Vp⊤ , p = 3
    foreach cii in C ∗ do
        L′i = [f̂i ⊗ f̂i ]⊤ T9×6
    q′ = L′+ c, Q3×3 = convert(q′ )
    A3×3 = chol(Q3×3 ) → S = A3×3^(−1) Ŝ, F = F̂ A3×3
    foreach fi in F do
        fi = (√cii / ‖fi ‖) fi
    Sb = [F 1]+ Z, bk+1 = last-row(Sb )
    k ← k + 1
until ‖bk+1 − bk ‖ < ε

Hence, each load constraint in C ∗ is explicitly re-enforced on the reconstructed motion Fk in order
to find an improved bias; in other words, the magnitude of each motion fik is normalized to
‖fik ‖2 = cii . Then, a new bias bk+1 for the next iteration is computed from the updated Fk and the
measurement Z.
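
A compact NumPy version of this iteration, mirroring Algorithm 4.2, is given below; the variable names are illustrative, and a failed Cholesky (which indicates a badly conditioned iteration) would simply raise an exception in this sketch.

import numpy as np

def triad_imu_calibration(Z, b0, c, n_iter=100, tol=1e-8):
    # Z: (m,3) measurements, b0: initial bias guess, c: (m,) known magnitudes ||f_i||^2
    b = np.asarray(b0, dtype=float)
    m = Z.shape[0]
    idx = [(r, s) for r in range(3) for s in range(r + 1)]   # lower triangle of Q3x3
    for _ in range(n_iter):
        D = Z - b                                            # bias-compensated data, Eq. (4.19)
        U, sv, Vt = np.linalg.svd(D, full_matrices=False)
        F_hat = U[:, :3] * np.sqrt(sv[:3])
        L = np.array([[F_hat[i, r] * F_hat[i, s] * (1.0 if r == s else 2.0) for r, s in idx]
                      for i in range(m)])                    # one row per magnitude, Eqs. (4.21)-(4.23)
        q = np.linalg.lstsq(L, c, rcond=None)[0]
        Q = np.zeros((3, 3))
        for k, (r, s) in enumerate(idx):
            Q[r, s] = Q[s, r] = q[k]
        A = np.linalg.cholesky(Q).T                          # Q3x3 = A^T A
        F = F_hat @ A                                        # reconstructed loads
        F *= (np.sqrt(c) / np.linalg.norm(F, axis=1))[:, None]   # re-enforce magnitudes, Eq. (4.24)
        Sb = np.linalg.lstsq(np.hstack([F, np.ones((m, 1))]), Z, rcond=None)[0]   # Eq. (4.25)
        b_new = Sb[3, :]                                     # Eq. (4.26)
        if np.linalg.norm(b_new - b) < tol:
            b = b_new
            break
        b = b_new
    return Sb[:3, :], b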

4.4.2 Bias convergence region

The advantages of this iterative factorization over the nonlinear methods [42, 43, 44] are that it is
based on a linear method and it requires an initial bias only; there is no need to presume initial
values for scale factors and alignments.
Convergence of the bias is demonstrated to be insensitive to the initial bias error. The
simulation and experiments in Figures 4.8 and 4.16 show that the convergence region is wide enough
to cover ‖b0 − b∗ ‖ < ‖s‖ (when ‖f ‖ = 1) for a true bias b∗ . When Z is generated by Cτ∗=1 ,
one automatic setting for the initial bias would be b0 = mean(Z), because no measurement
deviates from the bias by more than a scale factor.

4.5 Simulation Evaluation


The performance of factorization-based IMU calibration is evaluated mainly for an accelerome-
ter unit under the gravity magnitude constraint Cτ∗=1 in Figure 4.2. In the following numerical
simulations, a true shape Sb has a unit scale factor and unit bias, ksi k = bi = 1 for i = 1, . . . , n,
and consists of multiple tri-axial units, each skewed with respect to the others.

Figure 4.5: IMU calibration performance with respect to component size (n) and noise (σ) when the
gravity magnitude constraint Cτ∗=1 is used.


Figure 4.6: IMU calibration performance with respect to load size (m) and noise (σ) when the gravity
magnitude constraint Cτ∗=1 is used.


Figure 4.7: IMU calibration performance of mixed constraint types when the gravity magnitude con-
straint Cτ∗=1 (50%) and the orthogonal motion constraint (50%) are used together.

Performance of gravity magnitude constraint:

The performance of accelerometer calibration under Cτ∗=1 was assessed through simulation with
respect to the following three factors: Gaussian measurement noise (σ), the number of loads (load

Table 4.3: Simulation evaluation of calibration performance with respect to noise (σ) and component
size (n) when Cτ∗=1 is used.

bias (berr , %) scale (kskerr , %) alignment (∠s1 si,err , deg)


σ (%) n=4 n=6 n = 20 n=4 n=6 n = 20 n=4 n=6 n = 20

0.25 -0.01 ± 0.82 -0.01 ± 0.56 0.00 ± 0.31 -0.02 ± 0.84 -0.01 ± 0.59 0.00 ± 0.32 0.00 ± 0.42 0.00 ± 0.22 0.00 ± 0.12
0.5 0.02 ± 1.67 0.01 ± 1.11 0.01 ± 0.60 0.09 ± 1.72 0.04 ± 1.16 0.02 ± 0.63 0.00 ± 0.83 0.00 ± 0.44 0.00 ± 0.23
1.0 0.23 ± 3.53 -0.01 ± 2.17 -0.03 ± 1.31 0.66 ± 3.54 -0.04 ± 2.27 -0.07 ± 1.37 0.01 ± 1.75 0.01 ± 0.86 0.00 ± 0.48
4.0 2.24 ± 27.0 -0.03 ± 10.9 -0.58 ± 5.30 7.43 ± 27.0 0.27 ± 11.4 -1.55 ± 5.33 0.03 ± 8.04 -0.01 ± 3.58 -0.04 ± 1.95
8.0 1.58 ± 45.6 -0.10 ± 23.3 -2.27 ± 13.0 11.1 ± 46.3 1.04 ± 24.2 -5.88 ± 12.1 -0.20 ± 14.8 -0.10 ± 7.21 -0.18 ± 4.36

The load size is fixed to 36 (m = 36). See Figure 4.5 for the plot.

Table 4.4: Failure rate of the factorization for a redundant IMU when Cτ∗=1 is used (%).

noise (%) n=4 n=6 n = 10 n = 20

0.5 0.0 0.0 0.0 0.0


1.0 0.0 0.0 0.0 0.0
2.0 0.2 0.0 0.0 0.0
4.0 9.6 6.7 1.4 0.0
8.0 31.3 30.4 18.4 6.2


Figure 4.8: Bias convergence region of the iterative factorization for a triad-IMU. Each plot is a cross
section along b3 . Initial conditions, b0 = [b1 b2 b3 ], in the white region converge to the true b∗ = [1 1 1]
but those in a non-white region are trapped in local minima.

size m = |Cτ∗=1 |), and the number of sensor components (component size n). In order to reflect a
real scenario in in-field calibration, the IMU’s attitude (roll and pitch) was random but bounded
within ±60 degrees. Gaussian noise σ was given as a percentage of the scale factor ksi k.
For every triplet (σ, m, n), a Monte-Carlo simulation was performed with 1000 trials. From
Figures 4.5 to 4.7, the bias and scale factor errors were evaluated for every sensor component, and
the alignment error was evaluated over the vector angles of all possible pairs of sensor components. Figure 4.5
and 4.6 show that the calibration error decreases linearly as the component size (n) increases
from 4 to 24, the load size (m) increases from 9 to 144, or the measurement noise (σ) decreases
from 8 to 0.25%. Table 4.3 shows the errors in more details and indicates that the factorization
method is an unbiased estimator unless the measurement noise is too severe (σ ≥ 4.0%).
Figure 4.7 shows the case where angular motion constraints (inter-relation constraint) and
Cτ∗=1 (intra-relation constraint) are used together. A half of the constraints is given as fi> fj = 0
from multiple pairs of orthogonal loads [50]. Compared with Figure 4.6 that uses Cτ∗=1 only, the
bias and scale factor error decrease almost by half as the load size increases while the alignment
error is found to be greater for a small m. This fact indicates that additional knowledge about
angular relation makes the bias and scale factor more constrained when m is large.
Since a typical MEMS-based accelerometer [52] has noise less than σ = 0.25%, the calibration
result for an off-the-shelf IMU is expected to be better than 0.01% accuracy and 0.5% precision
when m = 30 and n = 6. The minimal load and component size (m = 9, n = 4) has 0.02%
accuracy and 0.8% precision when σ = 0.25%.
The calibration failure occurs when no solution for q produces a positive definite Q3×3 in
(4.15), which is usually induced by high measurement noise. Table 4.4 shows the failure rate for
1000 trials in the simulation when Cτ∗=1 is used. The failure begins to appear when the noise
is larger than σ = 2.0% and is significantly reduced as the load and component size increase.
Practically, this failure would be of little concern because a real accelerometer noise (σ = 0.2%)
corresponds to a zero failure rate.

Bias convergence region for a triad IMU:

The iterative factorization for the triad IMU in Section 4.4 requires an initial bias b0 . In
order to numerically evaluate its convergence region, we tested every initial point from
b0min = −[1 1 1] to b0max = [3 3 3] at a 0.05 resolution, with the true shape given as b∗ = [1 1 1]
and S∗ = I3×3 .
Figure 4.8 shows the bias convergence region B when m = 10, σ = 0.5% and C = Cτ∗=1 . Initial
biases in the white region converge to the true bias b∗ but those in the black region are trapped
in local minima. Although white regions are divided and scattered, there exists a continuous
convergence region that expands spherically around the true bias. Since the radius of this central

Error in the recovered rotation angle θc as a function of the distance ratio ρ (in degrees):

ρ      mean(θerr)   std(θerr)
10     -0.437       2.342
20     -0.161       1.179
50     -0.037       0.624
100    -0.020       0.409
200    -0.021       0.296
500    -0.030       0.342

Figure 4.9: Systematic error in camera rotation angle θc = phase(eig(H)). The homography H is
estimated by image feature correspondence when the true θc is 10 degrees.

volume is at least one, we can conclude that initial biases in kb0 − b∗ k < 1 are all convergent to
a global minimum. In other words, the initial error allowed for the bias can be as large as 100%
of the scale factor.

Homography angle for gyroscope calibration:

A systematic error exists in θc (H) in cases of non-zero camera translation (tc ≠ 0) and finite
scene-point distance (d ≠ ∞) (Figure 4.11). Let ρ = d/ktc k denote a distance ratio between the distance
d to scene points and camera translation amount ktc k. A rotating camera assumption on θc is
valid only if ρ = ∞. For a finite ρ, a systematic error of θc (H) is evaluated by the simulation
in Figure 4.9. The error is biased when ρ < 100 and its variance decreases logarithmically as ρ
increases. When ρ ≥ 200, the mean squared error of θc becomes less than 1%.

4.6 Camera/IMU Calibration for Sensor Fusion

The factorization framework is extended to the calibration of a camera/IMU system. Gravity


and natural landmarks such as vertical edges and distant feature points constitute constraints for
a collaborative camera/IMU calibration.

4.6.1 Camera vs. accelerometer

Cascaded linear method:

Suppose m images and IMU measurements Z are collected together in a static condition as
shown in Figure 4.10. Once the IMU is calibrated with the gravity-magnitude constraint Cτ∗=1, the gravity directions
(f1 , . . . , fm ) are known in the IMU frame. Because a vertical vanishing point vi is a projection of the


Figure 4.10: Camera/accelerometer calibration: the vertical vanishing point vi, which is the common intersection of vertical edges, is a projection of the gravity direction fi described in the camera coordinate frame.

gravity direction fi , there exists an infinite homography H between vi and fi [53]:

$$ v_i \;\cong\; K\,[\,R_c \;\; t_c\,]\begin{bmatrix} f_i \\ 0 \end{bmatrix} \;=\; K R_c\, f_i \;=\; H f_i \qquad (4.27) $$

where H = KRc , (Rc , tc ) is the camera motion, and ≅ represents an up-to-scale equality in homogeneous coordinates. For each vertical edge ℓij in the i-th image, vi and ℓij are related as follows:

$$ \ell_{ij}^\top v_i = 0 \quad \text{for } j = 1, \ldots, n_i \qquad (4.28) $$

A linear system equation for H is yielded from plugging (4.27) into (4.28) using all of the
vertical edge measurements:
$$ \ell_{ij}^\top H f_i = 0 \quad (i = 1,\ldots,m,\; j = 1,\ldots,n_i) \quad\Longrightarrow\quad M h = 0 \qquad (4.29) $$

where M stacks the row vectors (ℓij ⊗ fi)⊤ over all measurements and h is a vector of the H elements. Since H is up-to-scale, h is solvable from at least eight linearly independent constraints; one example is a set of four images containing two vertical edges each. Finally, K is obtainable from either QR decomposition of H or Cholesky decomposition, since HH⊤ = KK⊤.
Another possible solution is to find H from a set of constraints, vi × Hfi = 0. However, this
requires an additional step to estimate vi from {ℓij} and it cannot deal with the case that only a single
vertical edge is measured in an image.
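As a concrete illustration of this cascaded linear step, the following NumPy sketch (our own, not part of the original implementation) solves Mh = 0 for H and recovers K by Cholesky decomposition, assuming the gravity directions fi and the vertical-edge lines ℓij are already given in homogeneous coordinates:

    import numpy as np

    def calibrate_camera_linear(gravity_dirs, edge_lines):
        # gravity_dirs: list of m gravity directions f_i (3-vectors)
        # edge_lines:   edge_lines[i] is a list of homogeneous vertical-edge lines l_ij in image i
        rows = []
        for f, lines in zip(gravity_dirs, edge_lines):
            for l in lines:
                # l^T H f = 0  ->  (l kron f)^T h = 0, with h the row-wise vectorization of H
                rows.append(np.kron(l, f))
        M = np.asarray(rows)
        _, _, Vt = np.linalg.svd(M)
        H = Vt[-1].reshape(3, 3)               # null vector of M, up to scale
        W = H @ H.T                            # H H^T = K K^T since Rc is orthonormal
        W /= W[2, 2]                           # fix the scale so that K[2,2] = 1
        # Cholesky of the row/column-reversed matrix yields the upper-triangular K
        L = np.linalg.cholesky(W[::-1, ::-1])  # fails if W is not positive definite (high noise)
        K = L[::-1, ::-1]
        return K / K[2, 2], H

With noisy measurements the Cholesky step can fail, which is exactly the failure mode reported for the aerial scene in Table 4.10.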

Collaborative nonlinear method:


A Euclidean angle between two vertical vanishing points (vi , vj ) should be equal to a vector
angle between two gravity loads (fi , fj ), i.e.,
$$ \cos\theta_{ij} \;=\; f_i^\top f_j \;=\; \frac{v_i^\top \kappa\, v_j}{\sqrt{(v_i^\top \kappa\, v_i)\,(v_j^\top \kappa\, v_j)}} \qquad (4.30) $$

Figure 4.11: Camera/gyroscope calibration: A homography from feature tracks of distant scene points
over a known time interval provides the magnitude of an angular rate for the gyroscope calibration.

where the angle between two camera rays is computed by the cosine formula weighted by κ =
(KK⊤)⁻¹, which is called the image of the absolute conic [53].
A total cost Etotal that combines the load constraint Cτ∗=1 and the image angle constraint
(4.30) on vanishing point measurements is a function of q and κ:

$$ E_{\text{total}}(q, \kappa) \;=\; E_{\text{imu}} + E_{\text{cam/imu}} \qquad (4.31) $$

$$ \;=\; \sum_{i=1}^{m} (L_{ii}\,q - 1)^2 \;+\; \sum_{i,j} \left( L_{ij}\,q - \frac{v_i^\top \kappa\, v_j}{\sqrt{(v_i^\top \kappa\, v_i)\,(v_j^\top \kappa\, v_j)}} \right)^{2} \qquad (4.32) $$

where fi⊤fj in (4.30) is replaced with Lij q = [f̂b,i ⊗ f̂b,j]⊤ T16×10 q in (4.13).
Due to the nonlinearity with respect to κ, a nonlinear optimization method numerically seeks
a best (q, κ) that minimizes the squared sum of the combined constraint error, Etotal . An initial
value for (q, κ) can be provided from the cascaded linear solution.
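For reference, a small sketch of the image-angle term in (4.30), which supplies one residual per vanishing-point pair to the combined cost (4.32); the helper name and the choice of a generic least-squares solver are ours:

    import numpy as np

    def image_angle_cos(v_i, v_j, K):
        # cosine of the Euclidean angle between the camera rays through two vanishing
        # points, weighted by kappa = (K K^T)^-1 as in (4.30)
        kappa = np.linalg.inv(K @ K.T)
        num = v_i @ kappa @ v_j
        den = np.sqrt((v_i @ kappa @ v_i) * (v_j @ kappa @ v_j))
        return num / den

    # Each pair (i, j) then contributes a residual  (L_ij @ q) - image_angle_cos(v_i, v_j, K)
    # to E_total, which a nonlinear least-squares solver minimizes jointly over (q, K).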

4.6.2 Camera vs. gyroscope

During a pure camera rotation Rc, two images are related by an infinite homography H such
that xi = H x′i for every corresponding feature point (xi, x′i) [53]:

$$ H = K R_c K^{-1} \qquad (4.33) $$

Since H and Rc are similar matrices under the transformation K, they share the same eigenvalues
although their eigenvectors are generally different [51]. Since eig(H) = eig(Rc) = (1, e^{jθ}, e^{−jθ})
when θ is the rotation angle of Rc, the phase of a complex eigenvalue of H is equal to the rotation
angle of the gyroscope affixed to the camera. If the time interval ∆t between the two images is known,
the angular rate ω exerted on the gyroscopes has a magnitude ‖ω‖ = θ/∆t, although its
rotation axis is unknown.
When a set of homographies (H1 , . . . , Hm ) is derived from feature tracks over (∆t1 , . . . , ∆tm ),
the factorization method is able to calibrate a gyroscope unit using the load constraint set C∗ =
(‖ω1‖, . . . , ‖ωm‖). In practical in-field calibration, the pure rotation assumption for each Hi is
well approximated for distant scene points. Let ρ = d/‖tc‖ in Figure 4.11 denote the ratio between
the distance d to the majority of scene points and the camera translation amount ‖tc‖. When ρ ≥ 50,
the mean squared error of θ from H is less than 1%, as evaluated in Figure 4.9.
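A minimal sketch of how one rotation-magnitude load constraint is obtained from a single homography; the scale normalization by the cube root of the determinant is our choice, since any positive rescaling of H leaves the eigenvalue phase unchanged:

    import numpy as np

    def angular_speed_from_homography(H, dt):
        # H: up-to-scale infinite homography between two frames taken dt seconds apart,
        #    assuming an (approximately) purely rotating camera
        Hn = H / np.cbrt(np.linalg.det(H))         # det(K Rc K^-1) = 1, so remove the scale
        theta = np.max(np.abs(np.angle(np.linalg.eigvals(Hn))))   # phase of the complex eigenvalue pair
        return theta / dt                          # |omega| used as one load constraint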
The same homography set (H1 , . . . , Hm ) is also used for the rotating camera calibration [54, 55],
since the relation κ⁻¹ = Hi κ⁻¹ Hi⊤ yields a set of linear equations for κ⁻¹, where κ = (KK⊤)⁻¹.

Relative orientation:

Once the camera and IMU are both calibrated, the two bundles of motion vectors expressed
in each sensor's coordinates are compared to estimate the relative orientation Ric: {vi, fi} for the
camera/accelerometer and {ωi^cam, ωi^imu} for the camera/gyroscope. A motion of the IMU, fi and
ωi^imu, corresponds to the i-th row of F̂b in the factorization. The camera's angular velocity ωi^cam
is derived from Ri = K⁻¹ Hi K and ∆ti. The best fit of Ric such that vi = Ric fi for i = 1, . . . , m
is computed as the least squares solution Ric = UV⊤ from svd(Σ_{i=1}^m fi vi⊤) = UΣV⊤ [56].
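A compact sketch of this last step (the orthogonal Procrustes / Wahba problem); the sum is accumulated as Σ vi fi⊤ so that the rotation takes the UV⊤ form, and the reflection guard is a standard addition for noisy data:

    import numpy as np

    def relative_orientation(f_imu, v_cam):
        # f_imu, v_cam: (m, 3) arrays of paired direction vectors in IMU and camera frames
        B = sum(np.outer(v, f) for f, v in zip(f_imu, v_cam))   # sum_i v_i f_i^T
        U, _, Vt = np.linalg.svd(B)
        R_ic = U @ Vt                       # least-squares rotation with v_i ~ R_ic f_i
        if np.linalg.det(R_ic) < 0:         # guard against an improper (reflection) solution
            U[:, -1] *= -1
            R_ic = U @ Vt
        return R_ic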

4.7 Experiment Results


A series of experiments on the factorization method was conducted using a low-cost small
IMU/camera system shown in Figure 4.12 and 4.13. The redundant IMU (n = 6) was made
of two tri-axial IMUs (O-Navi Gyrocube) which are aligned with a skewed angle. MEMS iner-
tial components are Analog Devices’ ADXL-203 and ADXRS-150 whose measuring ranges are
±2g and ±200 ◦ /sec, respectively. A multi-channel 11-bit A/D converter sampled the output
voltage at 100Hz. The Sentech CCD board camera captured 640 × 480 resolution images at 30
frames/sec.

Accelerometer self-calibration:

To rapidly and reliably collect measurements subject to the gravity magnitude constraint
Cτ∗=1 , we placed the IMU on a tripod and oriented it within ±60 degrees. Various attitudes were
applied randomly, but we made certain to have at least three different roll or pitch angles so as
to prevent a degenerate load configuration DF like in Figure B.2 (a-b).
Figure 4.13 shows how the measurement matrix Z was collected from the output of six ac-
celerometers during the tripod operation. In total, four calibration datasets (E1, . . . , E4) were
generated with more than 20 loads for each, at different times, several days apart. The noise in
Z was measured as 0.73, which corresponds to σ = 0.2% when scaled by an average scale factor
(‖s‖ = 365.0).
Note that the redundant calibration treats two tri-axial IMUs as a whole (n = 6) and the
triad calibration treats each tri-axial IMU individually (n = 3, s123 and s456 ). Table 4.5 com-


Figure 4.12: An experimental system for the IMU self-calibration. A redundant IMU (n = 6) consists of
two tri-axial MEMS-based IMUs in a skewed alignment. The camera is used to provide load constraints
for the gyroscope calibration.


Figure 4.13: The output of six accelerometers generated when the IMU is randomly oriented by a tripod.
At each arrow in the plot, the output is collected into the measurement matrix Z when a static gravity
exerts on the accelerometers.

Table 4.5: Accelerometer calibration results from both redundant (n = 6) and triad (n = 3) calibration
methods when Cτ∗=1 . (Results are mainly shown for the first three components.)

No. m n b1 b2 b3 b4 b5 b6 ‖s1‖ ‖s2‖ ‖s3‖ ‖s4‖ ‖s5‖ ‖s6‖ ∠s12 ∠s13 ∠s45 ∠s56 ∠s14 ∠s15

E1 36 6 906.7 933.4 927.0 951.4 935.5 918.6 364.5 362.4 366.5 369.1 372.5 367.1 89.89 90.80 89.74 90.17 44.74 51.50
3 906.5 933.9 926.7 952.6 935.9 920.8 364.2 361.9 366.3 370.3 373.6 369.3 89.84 90.85 89.78 89.94 – –
E2 25 6 910.1 935.8 928.5 952.7 935.4 918.9 365.7 364.2 366.4 368.8 372.0 366.9 89.90 90.75 89.97 90.18 44.81 51.50
3 909.6 935.7 928.0 952.8 937.3 921.5 365.7 364.0 365.9 369.6 373.6 369.4 89.90 90.85 90.06 90.11 – –
E3 24 6 905.9 931.0 923.7 949.9 931.8 915.4 363.7 362.2 364.0 368.0 370.8 364.9 89.84 90.59 89.75 90.03 44.74 51.30
3 905.5 931.0 922.5 949.3 932.6 916.9 363.5 361.8 362.6 368.9 372.0 366.1 89.82 90.68 89.69 90.22 – –
E4 20 6 905.3 932.1 927.9 945.0 928.8 915.2 364.7 363.1 367.9 366.1 369.8 367.3 90.11 90.90 90.10 90.31 44.93 51.59
3 905.3 932.1 922.4 943.8 928.5 916.2 361.8 361.5 362.3 365.7 368.1 368.3 90.08 90.95 90.69 90.39 – –

[Figure 4.14 panels: Bias Variance (%), Scale Variance (%), and Alignment Variance (deg), plotted per parameter (b1–b6, s1–s6, ∠sij) for experiments E1–E4.]
Figure 4.14: Estimation variance of accelerometer parameters: each parameter individually resulting
from four experiments is compared using two methods, left (red) from the redundant calibration (n = 6)
as a whole and right (blue) from the triad calibration (n = 3) of each tri-axial IMU in Figure 4.12.


Figure 4.15: The accelerometer’s shape and motion matrix (F and S) are reconstructed from the
factorization-based redundant calibration (n = 6) in E4 experiment. All of the gravity loads are located
on a unit sphere. Two tri-axial IMU are (s1 , s2 , s3 ) and (s4 , s5 , s6 ).

Figure 4.16: Convergence rate of the bias in the iterative linear calibration (n = 3) for the first three
accelerometer components in E4 experiment. Initial bias b0 is given as [1100 1100 1100] (left) or mean(Z)
(right).

pares accelerometer parameters calibrated by these two methods, respectively, and shows little
difference between them.
The estimation variance of calibration parameters over four experiments is plotted in Fig-
ure 4.14. Deviations of the scale factor ‖si‖ and alignment ∠si sj for each component are less
than 0.5% and 0.1◦ , respectively. They are nearly the same as the simulation evaluation (0.59%
and 0.22◦ when σ = 0.25%, m = 36 and n = 6 in Table 4.3). The bias b has relatively large
deviation of 1% since the operating temperature severely affects the bias of a MEMS component
but the temperature was not maintained between the experiments.
An initial bias for the triad calibration was automatically given as the mean of measurements,
i.e., b0 = mean(Z) and no convergence failure occurred in Table 4.5. The convergence rate in
Figure 4.16 illustrates that the bias asymptotically approaches a steady state. For some initial
condition, an overshoot is observed at the very beginning of the iteration. The convergence takes
about 500 iterations when the distance of b0 from a true b is 75% of the gravity magnitude.
Figure 4.15 shows the accelerometer shape and motion matrix, S and F, reconstructed by the
factorization method. Each load in F located on a unit sphere reveals an applied attitude, and
the shape of S recovers two tri-axial units arranged in a skewed angle.

Gyroscope self-calibration:

Two sets of outdoor images (E5 and E6) in Figure 4.17 were collected at different distances
from scene objects while we oscillated the camera on a tripod multiple times with different axes.
The camera translation was limited to ‖tc‖ < 0.1 m. Since the camera is located far from main
buildings (d > 50 m for E5 and d > 300 m for E6), the pure camera rotation assumption is validated
by ρ ≥ 500.
The measurement matrix Z was filled with the m peaks of the gyroscope output in Figure 4.17.
Although any output could be added in Z, the peaks are practically preferred because they are

expected to have lower variance of angular speed between images, and convenient for automatic
measurement selection. The gyroscope noise was measured as 2.0, equal to σ = 0.66% (scaled
by ‖s‖ = 300). Then, the two neighboring images (Ii, I′i) closest to each peak zi are chosen for the
homography computation in Section 4.2.2. More than 200 feature points xi are selected in Ii
by the Harris corner detector and then tracked to x′i in I′i by KLT feature tracking [57]. Given
the feature correspondences (xi, x′i), the homography Hi such that x′i = Hi xi is found based on
RANSAC to remove outliers [53]. Finally, the angular speed ‖ωi‖ = θi/∆ti is obtained from θi =
phase(eig(Hi)) and a fixed frame interval ∆ti = 1/30 s. These m angular speeds (‖ω1‖, . . . , ‖ωm‖)
constitute the load constraint C∗ for the gyroscope calibration (m = 128 for E5 and m = 180 for
E6).
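A condensed sketch of this per-peak processing using OpenCV (grayscale frame pair img0/img1 around one gyro peak); the corner, tracking, and RANSAC thresholds shown are illustrative, not the values used in the experiments:

    import cv2
    import numpy as np

    def load_constraint_from_frames(img0, img1, dt):
        # Harris corners -> KLT tracking -> RANSAC homography -> rotation-angle phase
        pts0 = cv2.goodFeaturesToTrack(img0, maxCorners=300, qualityLevel=0.01,
                                       minDistance=7, useHarrisDetector=True)
        pts1, status, _ = cv2.calcOpticalFlowPyrLK(img0, img1, pts0, None)
        ok = status.ravel() == 1
        H, _ = cv2.findHomography(pts0[ok].reshape(-1, 2), pts1[ok].reshape(-1, 2),
                                  cv2.RANSAC, 2.0)
        H = H / np.cbrt(np.linalg.det(H))              # normalize the homography scale
        theta = np.max(np.abs(np.angle(np.linalg.eigvals(H))))
        return theta / dt                              # one entry of the load constraint set C*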
Table 4.6 compares gyroscope parameters found by the redundant (n = 6) and triad (n = 3)
calibration, respectively, in exactly the same way as is done in Table 4.5. Note that an additional
comparison is made for the gyroscope bias obtained at a stationary condition (ω = 0). The
estimated parameters are almost identical regardless of the calibration method with less than
0.3% deviation. The estimation variance between E5 and E6 is at most 1% at the third and
fourth components.
Figure 4.18 shows the reconstructed shape and motion matrix, S and F, of the gyroscope unit.
The distribution of F reveals that we have rotated the IMU around the same axis multiple times
but with different magnitudes.

Camera/accelerometer calibration:

The images and IMU measurements are collected together when the sensor system is placed
in front of office doors to obtain natural landmarks of vertical line segments, as shown in Figure 4.19.
Two camera lenses of focal length 3.8mm and 8.0mm are used for the images in Figure 4.19, with
different distances to the doors. The ground truth of K for each lens is precisely calibrated by
the DLR camera calibration toolbox [58] using a checkered board. The results in Table 4.5 are
used for the ground truth of the IMU calibration.
We use the Canny edge detector to extract straight lines from images. Each edge chain
is recursively divided into piecewise segments according to thresholds such as line fitting error
and line length. For automatic selection of vertical edges among all those detected, the lines
are clustered according to vanishing points in a RANSAC manner [25]. Either a dominant line
group or the one consistent with the IMU loads is selected as the true vertical line group Li in each
image. Then, a maximum likelihood estimate of the vanishing point vi for Li is found through
nonlinear minimization of the squared sum of orthogonal image distances when the image noise
is Gaussian [59].
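The linear (algebraic) vanishing-point estimate that initializes this nonlinear refinement can be written in a few lines; the per-line scale normalization below is our own choice to balance the contributions:

    import numpy as np

    def vanishing_point_linear(lines):
        # lines: list of homogeneous line vectors l_j of one vertical line group L_i
        L = np.asarray(lines, dtype=float)
        L /= np.linalg.norm(L, axis=1, keepdims=True)   # normalize each line vector
        _, _, Vt = np.linalg.svd(L)
        return Vt[-1]   # v minimizing sum_j (l_j^T v)^2 with ||v|| = 1 (cf. (4.28))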
Table 4.8 compares camera calibration results between the cascaded linear method and the

[Figure 4.17 image pairs are labeled with their estimated angular speeds, ranging from about 1.10 to 3.35 rad/s.]

Figure 4.17: (Top) Feature correspondences of distant scene points between two neighboring frames
around the peaks of the gyroscope output at bottom. (Bottom) The output of six gyroscopes generated
when the IMU is oscillated multiple times in different axes by a tripod.

Table 4.6: Gyroscope calibration results from the redundant (n = 6) and triad (n = 3) calibration
methods when C ∗ is given by image feature tracks.

No. m n b1 b2 b3 b4 b5 b6 ‖s1‖ ‖s2‖ ‖s3‖ ‖s4‖ ‖s5‖ ‖s6‖ ∠s12 ∠s13 ∠s45 ∠s56 ∠s14 ∠s15

E5 128 6 925.5 924.3 1047.0 1099.9 952.7 1017.3 291.2 296.3 291.7 299.9 280.3 298.5 92.42 90.78 91.09 89.43 44.99 51.88
3 924.9 925.0 1047.2 1099.3 953.5 1017.1 290.8 296.2 291.6 299.4 280.0 298.5 92.62 90.69 91.12 89.42 – –
ω = 0 926.9 923.6 1046.0 1100.0 953.4 1015.7 – – – – – – – – – – – –
E6 180 6 923.1 925.2 1041.6 1091.5 948.9 1015.0 290.0 294.2 290.2 292.4 286.3 297.2 90.09 89.70 90.88 89.11 46.25 50.43
3 922.8 924.6 1041.3 1091.0 948.5 1014.8 289.8 294.3 290.3 293.3 285.3 297.2 90.12 89.71 90.87 89.07 – –
ω = 0 927.4 925.1 1042.0 1094.7 952.3 1013.8 – – – – – – – – – – – –


Figure 4.18: The gyroscope’s shape and motion matrix (F and S) reconstructed from the factorization-
based redundant calibration (n = 6) in experiment E5. Each angular velocity in F has a different magni-
tude. Two tri-axial IMU are (s1 , s2 , s3 ) and (s4 , s5 , s6 ).

Table 4.7: Comparison between the factorization and constrained nonlinear optimization in the ac-
celerometer calibration (E1–E4). EZ and EC are given as RMSE (root mean squared error). ∆b is the mean
difference of b, and similarly for ∆‖s‖ and ∆∠sij.

           Factorization        Nonlinear          Parameter difference
No.        EZ       EC          EZ       EC        ∆b      ∆‖s‖     ∆∠s
E1         0.0934   0.0009      0.1917   0.0000    0.002    0.005    0.002
E2         0.1226   0.0006      0.1763   0.0000    0.010    0.023    0.003
E3         0.1225   0.0007      0.1729   0.0000    0.024   -0.016    0.006
E4         0.1159   0.0009      0.1700   0.0000   -0.017    0.010   -0.009


Figure 4.19: (Top) Images taken with an 8.0mm camera lens at 25 different attitudes (E2). The vertical
edges along the gravity direction at office doors produce vanishing points in 15 images. (Bottom) Images
taken with a 3.2mm camera lens at 24 different attitudes (E3). The vanishing points are produced in all
of the images. The bottom images are taken closer to the doors than those images at the top.

Table 4.8: Camera/accelerometer calibration result (use all the measurements, m = 25 or 24)

        fx        fy        s      cx       cy        b       ‖s‖      ∠s
E2   1083.29   1083.25   0.00   303.29   247.34   930.24   367.41     –
LIN    30.30     26.33  -6.35     1.20   -10.33     0.00     0.00    0.00
NON    -6.18     -9.94   0.00     0.96    -9.09    -0.70    -0.90   -0.08
E3    361.72    360.39   0.00   301.28   230.65   926.38   365.80     –
LIN    -2.24     -5.62  -1.80     1.73    -1.92     0.00     0.00    0.00
NON    -5.54     -7.76   0.00     0.36    -3.79    -0.28    -0.68   -0.07

E#: ground truth; LIN: cascaded linear method errors; NON: collaborative nonlinear method errors.
Camera calibration matrix K = [fx s cx ; 0 fy cy ; 0 0 1]

Table 4.9: Camera/accelerometer calibration result (under the minimal condition, m = 9, n = 4)

No.     fx        fy        s      cx       cy        b       ‖s‖      ∠s
E2   1083.29   1083.25   0.00   303.29   247.34   931.76   366.35     –
LIN    49.67     46.72  -2.44    -6.46   -18.00     3.18    -2.07   -0.51
NON   -20.75    -19.16   0.00    -8.96     8.67    -1.85    -1.55   -0.12
E3    361.72    360.39   0.00   301.28   230.65   927.66   364.65     –
LIN     6.93    -23.10  -0.58   -17.23     8.09     1.16    14.43    1.92
NON    -3.41     -8.64   0.00    -6.16    -1.10    -2.61     2.56    0.64

In Figure 4.19, only nine images and their corresponding zi are selected.

collaborative nonlinear method in Section 4.6, when all of the available measurements and con-
straints are used. The linear method reports an acceptable camera calibration performance for
both camera lenses. The maximum errors are bounded to 2.7% in the focal length and 4.1% in the
principal point. The nonlinear method reduces the focal length error to 0.9% in E2, while producing
essentially no changes in IMU calibration parameters. Note that the camera skew s is set to zero
in the nonlinear method in order to prevent overfitting.
The advantage of the collaborative nonlinear calibration is more clearly revealed in an un-
favorable condition where smaller calibration data sets are available. We intentionally limit the
data to the minimal condition for the linear method. Only nine IMU measurements and corre-
sponding images are selected and used for the calibration of four sensor components (m = 9 and
n = 4). The results in Table 4.9 show a significant reduction in both camera and IMU calibration
errors in the nonlinear method. The maximum errors are reduced from 6.3% to 2.3% in the focal
length, from 7.2% to 2.9% in the principal point, and from 3.8% to 0.5% in the IMU scale; these
reductions are caused by additional angle constraints in (4.30) between IMU and camera data.

Camera/gyroscope calibration:

From the desk and aerial scenes, we selected 30 camera motion pieces (n = 30) from each
scene, either in a steady rotation or at a local maxima of motion speed. Feature correspondences
were collected over two frames for the desk scene and four frames for the aerial scene. Some
examples of feature tracks are shown in Figure 4.20. The distance ratio ρ is roughly 20 for the
desk scene (‖t‖ = 10 cm, d = 2 m) and 50 for the aerial scene (‖t‖ = 2 m, d = 100 m). The ground
truth of K is obtained using a planar checkered board by the DLR camera calibration toolbox
[58], while that for S is obtained from a product datasheet. The ground truth of Ric is measured
from a mechanical drawing of the sensor plate in Figure 4.12 and corresponds to a zero angle.
Table 4.10 shows the automatic calibration results of the aerial scene obtained from linear
methods. As expected from the simulation results, the refinement step significantly increases
calibration accuracy compared with initial linear solutions. Note that the bias v is excluded
in the total refinement, since n = 30 is not sufficient for preventing the overfitting of increased
parameters. In the aerial scene, the linear camera calibration fails in Cholesky decomposition
due to the high noise presented in the homographies, which are derived from feature tracks in
low-resolution images. The alignment in the gyro shape S is relatively less accurate than other
parameters because the quality of calibration inputs is limited; rotation angles from homographies
are slightly biased and unsteady motions during ∆t cause the gyro measurements to become more
noisy than the feature tracks. Based on the ability of the single-parameter model S∗ to return
a smaller calibration error, we can deduce that the calibration dataset is not informative enough
to precisely recover the gyro’s internal alignment.
It is noteworthy that camera motions in the aerial scenes are not specifically intended for the
purpose of calibration, but for typical use of a hand-held device and aerial robot during normal
operation. This fact clearly demonstrates that our method works for on-the-fly calibration when
it is necessary to run gyro-aided feature tracking with no calibration priors in real scenarios.
A calibration dataset collected in as principled a way as the simulation input would contain more
informative constraints and further improve the calibration accuracy.

Factorization method vs. nonlinear optimization:


It is possible to cast IMU calibration as a constrained nonlinear optimization problem. Using
the following cost Eimu for a given measurement Z and a load constraint set C, optimal parameters
(F, S, b) can be found by minimizing Eimu .

$$ E_{\text{imu}} \;=\; E_Z + E_C \qquad (4.34) $$

$$ \;=\; \sum_{i=1}^{m}\sum_{j=1}^{n} \big(f_i^\top s_j + b_j - z_{ij}\big)^2 \;+\; \lambda \sum_{(i,j)\in C} \big(f_i^\top f_j - c_{ij}\big)^2 \qquad (4.35) $$

where λ is a Lagrange multiplier for the constraint C. This cost function is the same as that for
the bundle adjustment [60] in computer vision, which refines a visual reconstruction to produce

Table 4.10: Auto-calibration results on the aerial scene experiment. A total of 30 homographies were
used (n = 30). The Cholesky decomposition failed due to high noise in the homographies.

                    K (320 × 240)                      S                           Euler(Ric) (deg)

Ground-truth   [286.34   0.01    5.02]      [298.51     0       0   ]           (0.00, 0.00, 0.00)
               [  0    286.53   -0.72]      [  0      298.51    0   ]
               [  0      0       1.00]      [  0        0     298.51]

Linear         Cholesky fails               [265.10  -108.67  -76.82]                   –
                                            [  0      239.63  -35.81]
                                            [  0        0     245.34]

Linear*        [208.69   0       0   ]      [281.83     0       0   ]           (-3.45, 1.94, -8.06)
               [  0    208.69    0   ]      [  0      281.83    0   ]
               [  0      0       1.00]      [  0        0     281.83]

Refinement     [270.38   3.58   -2.19]      [259.23   13.01   -78.94]           (2.50, 0.79, -9.03)
               [  0    272.28  -44.48]      [  0      274.06  -56.90]
               [  0      0       1.00]      [  0        0     285.88]

[Figure 4.20 image pairs are labeled with their estimated angular speeds, ranging from about 0.90 to 1.47 rad/s.]

Figure 4.20: Feature correspondences and the output of tri-axial gyroscopes used for online calibration
during the aerial maneuvering

jointly optimal structure and camera poses.


Generally, many solutions to this nonlinear optimization are numerically ill-conditioned or
result in computationally complex approaches with many convergence hurdles. Moreover, high
sensitivity to parameter variations requires a good initial parameter to avoid local minima.
In contrast, the factorization method is not only numerically stable but also globally optimal.
The SVD of Z in (4.5) using the proper rank guarantees that EZ in (4.34) is always at a global
minimum regardless of the subsequent reconstruction step. The minimum EZ is equal to ‖Z − F̂b Ŝb‖F
in (B.1), and the minimum EC is achieved by the least squares solution Lq = c in (4.13).
Table 4.7 shows that the factorization and nonlinear optimization have almost the same mini-
mum cost and the calibration parameter difference between them is negligible. A small difference
between EZ and EC is caused by the fact that the constrained optimization strictly enforces EC
to zero, while EZ increases slightly as a trade-off.
Chapter 5

Motion Planning and Control

The problem of motion planning for a fixed-wing aircraft is more difficult than for other types
of aircraft. In general non-acrobatic maneuvers, fixed wings cannot hover and climb vertically
or make zero radius turns because they require a minimum forward velocity to maintain the lift
against gravity. Given the change of a target’s location and orientation, the corresponding vehicle
motion should be planned deliberately, as turning requires significant time and space.
The goal of this chapter is to develop a real-time motion planning method that allows a fixed-
wing UAV to operate toward a target in the presence of obstacles and to reliably accomplish
the air-slalom task in Figure 2.1. In order to respond flawlessly to newly identified targets and
obstacles in the task, a motion planner needs to arrange a feasible path that respects kinematic
and dynamic constraints of fixed-wings. Computational load should also be considered carefully
for fast replanning with no significant delays when facing limited computation resources.
In this chapter, we present two approaches for this motion planning problem. Both are based
on a set of motion primitives that are sampled from controllable vehicle behaviors and can be
interconnected at trim states. The first one is an online search-based planning method that
explores an incremental reachability motion tree to find an obstacle-free path. The search is
efficiently bounded by a 3D distance metric that is based on 2D Dubins curves. The second is a
global planning method that precomputes optimal paths for all of the discretized vehicle states
in an obstacle-free space. The motion uncertainty due to external disturbances will be explicitly
considered by a probabilistic motion model using a Markov Decision Process.

5.1 Related Work

The problem of UAV motion planning is especially challenging due to several complexities that
have not been addressed by traditional planning strategies. Differential constraints become more
important to find a feasible trajectory. Wind disturbance makes it impossible for the vehicle


to precisely follow a pre-computed plan. Limited sensor capabilities increase uncertainty of not
only vehicle and target states but also the knowledge about the environment. Previous research
in robotics has explored a vision-based navigation [61], basic obstacle avoidance during flight [62]
and a 3D local trajectory planner with a predefined global path [63]. Reactive steering
for flocking, grouping and obstacle avoidance [64] and an online search with trajectory precomputation [65]
have also been proposed, both targeted at computer animation.
Many planning algorithms to produce 2D or 3D trajectories for autonomous aircraft guid-
ance in known or unknown environments are based on a decomposition approach; first solve a
dynamics-unconstrained planning problem, and then smooth the found path to a feasible tra-
jectory using differential constraints. Traditional and state-of-the-art planning algorithms in
textbooks [66, 67, 68, 69] and review papers [70, 71, 72, 73] may be used in this approach.
Our approach to planning with differential constraints is closely related to a finite-state
motion model called a maneuver automaton (MA) [74]. Trim states of an air vehicle have been effectively
used for motion planning of complicated nonlinear and high-dimensional dynamic systems [74,
75, 76, 77, 78, 79, 80]. The concept of the MA is based in part on the fact that human pilots can
accomplish nimble control using a mixture of trim trajectories and maneuvers. The general idea
of finite state models is to search or optimize a path connected in a discretized finite-dimensional
space instead of an infinite one. This significantly reduces the computational complexity of a
motion planning problem under differential constraints.

5.2 Motion primitives


Let a nonlinear differential system ẋ = f (x, u) be the dynamics of a fixed-wing airplane. The
vehicle state x is a high dimensional variable, x = (x, y, z, φ, θ, ψ, u, v, w, p, q, r, α, β), and the
control inputs u are a thrust and surface deflections. Using a feedback controller u∗ during time
k to k + 1, this complex dynamic system can be reduced to a discretized dynamic system fd that
makes the transition from a trim state sk to another trim state sk+1 .

sk+1 = fd (sk , u∗ (k, k + 1)) (5.1)

Since the dimension of a trim state is much lower than that of the original state, motion
planning in a trim-state space is better suited to run in real time when a set of motion
primitives is built in such a way that the primitives interconnect discretized trim states.

5.2.1 Trim states

In aerodynamics literature [81], a fixed-wing’s trim state is referred to as one of the following
three motions: (1) level flight, (2) constant-altitude turn, and (3) turn with a constant climb


Figure 5.1: Trim states of a fixed-wing airplane: (a) three motions in a trim state: level flight, constant
altitude turning, and turning with a constant climb rate. (b) a finite state machine (FSM) defined by trim
states s(φ, θ, Va ).

rate, as illustrated in Figure 5.1(a). Though an equilibrium state ẋ = 0 is generally called trim,
trim states of a fixed wing include cases where some vehicle states are not constant. In turning
with a constant climb rate, the altitude and the yaw angle change at constant rates (ż = Va sin γ
and ψ̇ = Va /R) when a climbing angle γ, turning radius R and airspeed Va are given.
In the trim conditions above, the position (x, y, z) and the heading ψ in an inertial frame
are irrelevant to the description of the motions and the remaining states are constrained by the
dynamics f and three parameters (γ, R, Va ). Since γ is related to the pitch θ through the angle of attack
α, and the roll φ is determined by the turning radius R at the constant speed Va , all of the trim conditions can be described by
the following trim state s:

s = (φ, θ, Va ) (5.2)

The control input u∗ in (5.1) is required to drive the airplane to a given trim state. Suppose
that there exist three controls (e.g., thrust, rudder and elevator) that are generally available
in a small model airplane. Given a reference profile for desired vehicle states (φd , θd , Va,d ), the
following PID (proportional-integral-derivative) controllers are used to minimize the error be-
tween measured and desired states. Each controller is independently dedicated to control the
corresponding trim variable and is implemented in autopilot.
$$ u_t(t) = K_{p,t}\,(V_{a,d} - V_a) + K_{i,t}\!\int (V_{a,d} - V_a)\,dt \qquad\qquad \text{airspeed hold} \quad (5.3) $$

$$ u_r(t) = K_{p,r}\,(\phi_d - \phi) + K_{d,r}\,(\dot\phi_d - \dot\phi) + K_{i,r}\!\int (\phi_d - \phi)\,dt \qquad\qquad \text{roll hold} \quad (5.4) $$

$$ u_e(t) = K_{p,e}\,(\theta_d - \theta) + K_{d,e}\,(\dot\theta_d - \dot\theta) + K_{i,e}\!\int (\theta_d - \theta)\,dt + \theta_{\text{level}} \qquad\qquad \text{pitch hold} \quad (5.5) $$

where θlevel is an angle of attack that holds an altitude at the air speed Va . Vehicle behaviors
are then governed by the closed-loop dynamic system fd , in which the past desired trim state sk
advances to a new desired state sk+1 by the feedback control u∗ during time Ts .
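A minimal discrete-time sketch of the three hold loops (5.3)–(5.5); the gains, output limits, and the θlevel offset below are placeholders that would be tuned for a specific airframe, not the values used in our autopilot:

    class PID:
        # simple PID with an output bias term (used for theta_level in the pitch loop)
        def __init__(self, kp, ki=0.0, kd=0.0, bias=0.0, limit=1.0):
            self.kp, self.ki, self.kd, self.bias, self.limit = kp, ki, kd, bias, limit
            self.integ = 0.0

        def update(self, err, err_rate, dt):
            self.integ += err * dt
            u = self.kp * err + self.ki * self.integ + self.kd * err_rate + self.bias
            return max(-self.limit, min(self.limit, u))

    # one loop per trim variable (illustrative gains)
    throttle_hold = PID(kp=0.8, ki=0.2)                       # airspeed hold (5.3)
    roll_hold     = PID(kp=0.6, ki=0.05, kd=0.1)              # roll hold (5.4)
    pitch_hold    = PID(kp=0.9, ki=0.05, kd=0.15, bias=0.05)  # pitch hold (5.5), bias ~ theta_level

    def autopilot_step(meas, ref, dt):
        # meas/ref: dicts with Va, roll, pitch and the corresponding angular rates
        ut = throttle_hold.update(ref["Va"] - meas["Va"], 0.0, dt)
        ur = roll_hold.update(ref["roll"] - meas["roll"], ref["roll_rate"] - meas["roll_rate"], dt)
        ue = pitch_hold.update(ref["pitch"] - meas["pitch"], ref["pitch_rate"] - meas["pitch_rate"], dt)
        return ut, ur, ue    # thrust, roll-surface, and elevator commands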

5.2.2 Generation of motion primitives

There exist infinitely many motion primitives, each of which makes a transition between two
points in a continuous space of trim states. To select a limited number of transitions in building
motion primitives, we confine the trim states to a one-dimensional space s∗i = s(φi , θ∗ , Va∗ ) where
θ∗ and Va∗ are predefined constants. The roll angle φi is discretized and its change to another
trim state creates a lateral motion primitive. While the pitch angle θ is always θlevel at both ends
of a motion (i.e., θ∗ = θlevel at trim), it is not necessarily constant during a transition so that it
can create a longitudinal motion primitive.
Figure 5.2(a) shows the ramp and trapezoidal motion profiles of desired roll and pitch angles
(φd (t) and θd (t)), respectively, which are used in the PID controllers during a transition time Ts
between the discretized trim states s∗ . For the pitch, the peak of θd (t) determines the amount
of altitude change. Each profile provides a sufficient settling time so that the error from a final
value remains small and the angular rates (p, q, r) become stationary under small ripples. We
give a rise time Tr = 0.5s and Ts = 2s for lateral motions and Ts = 3s for longitudinal motions
because the pitch angle requires additional time to be regulated.
A set of motion primitives in Figure 5.3(b) is generated using nine discrete roll angles and
five different pitch peaks, φi = {−60◦ , −45◦ , −30◦ , −15◦ , 0◦ , 15◦ , 30◦ , 45◦ , 60◦ } and θd (t)peak =
{−30◦ , −15◦ , 0◦ , 15◦ , 30◦ }, so that each starting s∗i has 45 motion branches reaching to another
s∗ . More complex vehicle behaviors are possible when these motion primitives are interconnected.
Given a dynamic model, motion primitives can be computed offline and saved in a look-up table.
Figure 5.4 shows a reachability tree when the vehicle begins a level flight and the node expansion
stops at the third depth.
Relationships between the motion primitives can be described by a deterministic finite state
machine (FSM) between the trim states (s∗1 , . . . , s∗9 ), as shown in Figure 5.1(b). In the FSM, each
motion primitive is equivalent to a transition function from one state s∗i to another s∗k . For each
pair of (s∗i , s∗k ), there exist five different longitudinal motion primitives depending on θd (t).
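The off-line library can be stored as a simple look-up table keyed by the FSM transition; the sketch below assumes a closed-loop simulator of fd is available and uses our own naming:

    import itertools

    ROLLS = [-60, -45, -30, -15, 0, 15, 30, 45, 60]   # discretized trim roll angles [deg]
    PITCH_PEAKS = [-30, -15, 0, 15, 30]               # pitch reference peaks [deg]

    def build_motion_library(simulate_primitive):
        # simulate_primitive(phi_from, phi_to, peak) is assumed to integrate the
        # closed-loop dynamics f_d with the reference profiles of Figure 5.2 and
        # return the body-frame displacement and sampled trajectory of that primitive.
        library = {}
        for phi_from, phi_to, peak in itertools.product(ROLLS, ROLLS, PITCH_PEAKS):
            library[(phi_from, phi_to, peak)] = simulate_primitive(phi_from, phi_to, peak)
        return library    # each of the 9 start states has 9 x 5 = 45 outgoing branches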

5.3 Search-based Motion Planning

Given the motion library F in Figure 5.3, a motion planning problem for the air-slalom task is to
find a path that connects two points in the configuration space C = {x, y, z, θ, φ}. In this discrete
planning one may use any graph search algorithm such as depth-first or A* to find a path from


Figure 5.2: PID control response between one-dimensional trim states s∗ : (a) roll reference profile φd (t)
for a lateral motion primitive and pitch reference profile θd (t) for a longitudinal motion primitive, (b) roll
angle control response in simulation, (c) pitch angle control response in simulation.

Figure 5.3: A set of motion primitives that make transition between one-dimensional trim states s∗
(θ∗ = θlevel , Va∗ = Vcruise ). Each set consists of nine discretized φk in a resolution of 15◦ for lateral
motions and five different θ profiles for longitudinal motions.

Figure 5.4: A reachability tree of the motion primitives in Figure 5.3: when expanded by nine lateral
and five longitudinal actions at each node, the third stage produces 91,125 new nodes.


Figure 5.5: A heuristic function that estimates the optimal distance to a goal: (a) two examples of 2D
Dubins curves, the shortest paths between xI , xG ∈ SE(2); (b) the 3D Dubins metric
in the configuration space C = {x, y, z, φ} for the air-slalom task.

a starting state to a goal state. A search tree like that in Figure 5.4 is incrementally constructed
through a state transition in F.
Among many search algorithms, a selection of affordable online planning is restricted by the
fact that the branching factor b at each node is quite large in order to cover lateral as well as
longitudinal motions (b = 45). If a goal is placed d steps away from a starting point (d = 10 ∼ 15
for sufficient space for obstacle avoidance), the search tree size in an exhaustive search like the
breadth-first would grow exponentially with respect to d, i.e., O(b^d). Hence, the trade-off between
path quality and computational load needs to be considered.
An informed search like the best-first is more appropriate for online planning, but it requires
accurate heuristic information about the distance to a goal in order to reduce search time. There-
fore, we approximate the shortest path to a goal using a Dubins-based heuristic, which closely
estimates trim state-based 3D behaviors of a fixed-wing.

5.3.1 3D Dubins heuristic

The Dubins curves have been used as an efficient distance metric for car-like mobile robots [67,
82, 83]. Like the examples in Figure 5.5(a), when a turning radius is lower bounded, the shortest
path between two configurations (xI , xG ) ∈ SE(2) can be expressed as a combination of no more
than three motion primitives. The motion primitive for a car is either a straight line or circular
path of a minimum radius.
The trim state-based motion primitives in Figure 5.3 can also be considered to be bounded
by a minimum horizontal turning radius Rmin and a maximum climbing rate µmax . Therefore,
it is possible to approximate their motions using the following helical kinematic equation [69], in

Algorithm 5.1: Computation of 3D Dubins distance metric

d3D = Dubins_3D_Metric(ci(xi, yi, zi, φi), ck(xk, yk, zk, φk); Rmin, µmax)
begin
    d2D = Dubins_2D((xi, yi, φi), (xk, yk, φk))
    ∆z = |zi − zk|
    while ∆z / d2D > sin(µmax) do
        d2D ← d2D + 2πRmin
    return d3D = √((d2D)² + (∆z)²)

which two inputs are a roll angle φ and altitude change rate µ:

ẋ = cos θ, ẏ = sin θ, θ̇ = tan φ, ż = µ (5.6)

where tan φmax = 1/Rmin and the air speed is normalized to 1. A non-zero constant input φ and
µ to (5.6) generates a helical motion, which is the same as the trim motion that turns with a
constant climb rate in Figure 5.1(a).
When a vehicle is located at ci = (xi , yi , zi , φi ) with respect to a goal point cg = (xg , yg , zg , φg ),
the distance between two points d(ci , cg ) is approximated by the traveling distance of the kine-
matic equation (5.6) whose inputs are fixed to two parameters Rmin and µmax . In Figure 5.5(b),
the 3D path length can be approximated as follows: (i) compute a 2D Dubins distance metric
d2D on the horizontal plane while altitude change is ignored, (ii) add one revolution of a circle
with a radius Rmin until the horizontal traveling distance d2D is long enough to satisfy the max-
imum climbing angle µmax , (iii) return a combined distance of the horizontal distance d2D and
the vertical altitude change. See Algorithm 5.1 for further detail.
Using the phase partitions between the words in Dubins curves [83], the path length d2D can be
quickly obtained without expensive computation. One concern is discontinuity of the distance
metric at partition boundaries. A small change in configurations may cause a big change in the
estimate to a goal near partition boundaries in sampling-based motion planning. In practice,
therefore, we allow a small margin ±10◦ for the goal orientation φg and find a minimum under
the margin to determine the shortest distance for d2D .
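A direct Python transcription of Algorithm 5.1, assuming any 2D Dubins solver is supplied as a callable (we do not reimplement the Dubins words here):

    import math

    def dubins_3d_metric(ci, ck, r_min, mu_max, dubins_2d):
        # ci, ck: (x, y, z, heading) configurations; mu_max: maximum climbing angle [rad]
        # dubins_2d((x, y, psi), (x, y, psi), r_min) returns the 2D Dubins path length
        d2d = dubins_2d((ci[0], ci[1], ci[3]), (ck[0], ck[1], ck[3]), r_min)
        dz = abs(ci[2] - ck[2])
        while dz > d2d * math.sin(mu_max):   # add full circles until the climb angle is feasible
            d2d += 2.0 * math.pi * r_min
        return math.hypot(d2d, dz)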

5.3.2 Greedy best-first search

A greedy best-first search in Algorithm 5.2 sorts a priority queue according to the heuristic func-
tion d3D so that the node estimated closest to a goal is exploited first. The path found in this
way is not guaranteed to be optimal. In most cases, however, far fewer nodes than the worst case
O(b^d) are explored until the goal is found; hence, a much faster running time is expected. The use

Algorithm 5.2: Greedy best-first search using trim state-based motion primitives

p = Greedy_Best_First_Search(xstart, xgoal; F, O)
Set priority queue Q = ∅, kmax
begin
    Q.insert(xstart, Dubins_3D_Metric(xstart, xgoal))
    while Q ≠ ∅ and k < kmax do
        x = Q.top()
        if x ∈ G(xgoal) then
            return p = Trace_Back(x)
        else
            foreach fi ∈ F(x) do
                xnew = Motion_Primitive_Transition(x, fi)
                if xnew ∉ O of obstacles then
                    Q.insert(xnew, Dubins_3D_Metric(xnew, xgoal))
        k ← k + 1

of an efficient heuristic is critical to improve the quality of a path and avoid the worst case in
terms of time and space complexity.
The search begins with building a tree and assigning a current vehicle state as its root. By
applying all of the motion primitives available at the current vehicle state, a new set of nodes
corresponding to unvisited states is generated at the next depth level. If any point on a motion
primitive is in any obstacle region, that node is discarded from a tree. The costs to a goal for all
new nodes are computed from the 3D Dubins heuristic. The motion node closest to the goal is
expanded again until the state in the goal region is found.
Figures 5.6 and 5.7 show the performance of our search-based planner in 2D and 3D when our
Dubins metric is used as the measure of an expected cost to the goal. The greedy best-first search
keeps exploring the node of the current smallest cost value until the vehicle reaches one of goal
states G. Many more node expansions occur near the goal in an effort to meet the goal direction
as a result of the discrete nature of the sampled motions. All of the motion primitives queued in
the search tree are displayed in order to show which node is a current best estimate towards the
goal at each iteration. Although the found plans are not optimal in terms of the distance, the
small number of total expanded nodes proves the effectiveness of the heuristic function d3D we
propose. Figure 5.7 shows 3D motion plans in the presence of obstacles when the goals and the
UAV are located at different altitudes.

Figure 5.6: 2D motion planning examples of the greedy best-first search using only lateral motion
primitives in the presence of obstacles. The goal and the UAV are located at the same altitude. All the
node expansions explored in the search tree are also displayed.

Figure 5.7: 3D motion planning examples of the greedy best-first search using both lateral and longi-
tudinal motion primitives in the presence of obstacles. The goal and the UAV are located at different
altitudes.

5.4 MDP-based Motion Planning under Motion Uncertainty


This section describes a global planning method that finds an optimal feasible path to a goal
for every vehicle state in a 2D obstacle-free space. In particular, this planning method explicitly
considers the uncertainty of a motion primitive due to imperfect flight control and external dis-
turbances such as wind. The goal is therefore to find an optimal motion sequence that maximizes
the probability that the vehicle will reach the target at a desired heading angle.
For a goal region located at the origin of a workspace, we formulate the planning problem as
a Markov Decision Process (MDP) based on the following three factors; an uncertainty motion
model using probability distributions, efficient discretization of the state space, and infinite hori-
zon Dynamic Programming (DP) that computes a sequence of motion transitions that maximizes
the probability of success. Since the DP returns an optimal action for every vehicle state in
a discretized workspace, the best motion requested at any moment during navigation can be
retrieved from a DP look-up table.

5.4.1 Probabilistic model of a motion primitive

Figure 5.8 shows a set of lateral motion primitives that we use for discrete state transitions in the
MDP. A constant cruise speed Va is maintained in flight and each motion is driven by a change
in the roll angle, which is discretized between -30 to 30 degrees in a 10-degree resolution. The
motion length is proportional to the amount of the corresponding roll angle change.
Let q = (x, y, ψ, φ) be the vehicle state on a 2D horizontal plane. Since only lateral vehicle
motions are considered here, the trim state of a fixed-wing UAV is fully characterized by the
position (x, y), heading angle ψ and roll angle φ. A control input u that causes a transition
between states corresponds to the change of a desired roll angle.
Due to an unpredictable error in the feedback control and imprecise vehicle state estimation,
the next state qk+1 driven by a control input u from a current state qk is not deterministic.
Additional uncertainty in the state transitions may be caused by external disturbances such as
wind gusts and atmospheric turbulence. Actual paths will not always exactly trace the simulated
trajectories in Figure 5.8. Figure 5.9 shows lateral motion primitives executed and collected in
real flight experiments when the same control input is applied, and reveals non-deterministic
responses of state transitions. Since transient and residual errors in the roll controller (5.4) make
the most significant impact on the state transition, we model the uncertainty of a vehicle motion
using a probability distribution of u as follows:

P (qk+1 |qk , u) = fd (qk , u), u ∼ N (φk+1 − φk , σφ ) (5.7)

where σφ = κ|φk+1 − φk| reflects the fact that greater motion uncertainty occurs as a larger
control input is given. From the feedback control dynamic system fd in (5.1), the control input


Figure 5.8: A set of lateral motion primitives used in the MDP-based motion planning: each motion
primitive makes a state transition from qk = (x, y, ψ, φ) to qk+1 using the control input u in the feedback
control dynamic system qk+1 = fd (qk , u). At each column, the motions start with the same φk and reach
different final φk+1 (φk and φk+1 are one of -30, -20, -10, 0, 10, 20 and 30 degrees). The motion length is
proportional to the amount of roll angle change, i.e., t = 0.3 |φk+1 − φk | + 0.6 sec at Va = 10.5 m/sec.

u is modeled as normally distributed with the mean of a roll angle change, ∆φ = φk+1 − φk , and
the standard deviation σφ . Figure 5.10 illustrates a probabilistic distribution of qk+1 when finite
discrete samples of a normal distribution u are applied.
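A small sketch of how the control distribution can be discretized into the Np = 3 samples used later in Section 5.4.3; the bin boundaries halfway between neighboring samples are our own assumption about how the normal curve is integrated:

    import math

    def normal_cdf(x, mu, sigma):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

    def sampled_controls(phi_from, phi_to, kappa=0.1):
        # returns [(control sample, probability)] approximating u ~ N(dphi, sigma_phi)
        dphi = phi_to - phi_from
        sigma = kappa * abs(dphi)
        if sigma == 0.0:
            return [(dphi, 1.0)]                        # no roll change: deterministic transition
        samples = [dphi - sigma, dphi, dphi + sigma]    # at -1, 0, +1 standard deviations
        edges = [-math.inf, dphi - 0.5 * sigma, dphi + 0.5 * sigma, math.inf]
        probs = [normal_cdf(edges[i + 1], dphi, sigma) - normal_cdf(edges[i], dphi, sigma)
                 for i in range(3)]
        return list(zip(samples, probs))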

5.4.2 State space discretization

The discretization of a state space is based on a grid of points with an equal spacing ∆ in x
and y and angular bins in ψ and φ. For a rectangular workspace bounded by xmax and ymax ,
Ns = xmax ymax/∆² position states are aligned at the origin. The heading angle ψ is discretized
by dividing [−π, π] into Nψ equally spaced bins, where Nψ is a multiple of 4 for purposes of
symmetry. The roll angle φ is discretized in the same way as in the lateral motion primitives
in Figure 5.8, i.e., one of -30, -20, -10, 0, 10, 20 and 30 degrees, and Nφ = 7.
Using this discretization, a vehicle state q can be approximated as a discrete state where the
position and heading angle are rounded to the point closest to (x, y, ψ) in the corresponding grid
space and the index of the roll angle changes with the control input. The total number of discrete
states is N = Ns Nψ Nφ .
Since the value iteration for an optimization process in the MDP requires O(kN ) time and
memory for each iteration where k is the number of discrete samples of the probabilistic state
transition in (5.7), the discretization resolution ∆ and Nψ should be taken into consideration
for a reasonable computation time and memory. In our implementation, we use ∆ = 2m and
Nψ = 120 for a 3-degree resolution in ψ.
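For illustration, a state can be mapped to its row in the DP look-up table as follows (the workspace origin convention and the clamping are our own assumptions):

    import math

    DX = 2.0                                  # grid spacing in x and y [m]
    N_PSI = 120                               # heading bins (3-degree resolution)
    PHIS = [-30, -20, -10, 0, 10, 20, 30]     # roll bins [deg]

    def state_index(x, y, psi, phi_deg, x_max=100.0, y_max=100.0):
        nx, ny = int(x_max / DX), int(y_max / DX)
        ix = min(nx - 1, max(0, int(round(x / DX))))     # clamp into the workspace
        iy = min(ny - 1, max(0, int(round(y / DX))))
        ipsi = int(round((psi + math.pi) / (2.0 * math.pi) * N_PSI)) % N_PSI
        iphi = min(range(len(PHIS)), key=lambda k: abs(PHIS[k] - phi_deg))
        return ((ix * ny + iy) * N_PSI + ipsi) * len(PHIS) + iphi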


Figure 5.9: Uncertainty of a vehicle motion in the execution of lateral motion primitives in the real
world. At each column, the trajectories of a motion primitive are collected from real flight experiments
when the same control input u is applied to change from φk to φk+1 . The state transition from qk to qk+1
corresponding to both ends of each trajectory is not deterministic due to control error and disturbance.

[Figure 5.10 labels: P(qk+1 | qk, u0) = α and P(qk+1 | qk, u1) = P(qk+1 | qk, u2) = (1 − α)/2, plotted against φk+1 − φk.]

Figure 5.10: Probabilistic model of a state transition in a Markov Decision Process by the corresponding
lateral motion primitive in a discretized state space.

5.4.3 Markov Decision Process

The goal of our MDP-based motion planning is to compute an optimal control u∗ for every state
q in a workspace to maximize the probability of success Ps where Ps (q) = 1 when q is inside a
goal region G. Given a control u for some state q, Ps depends on the response of the vehicle to
the control and the probability of success in the subsequent state. The expected probability of
success is described as Ps (q) = E[Ps (v)|q, u] over a random variable v for the next state. For N
discrete states, the motion planning problem is equal to determining an optimal control ui for
each state i = 1, . . . , N using the discrete approximation and the expansion of the expected value

to a summation as follows:
$$ P_s(q) \;=\; \max_{u}\, E\big[P_s(v) \mid q, u\big] \;=\; \max_{u_i} \sum_{j=1}^{N} P_{ij}(u_i)\, P_s(q_j) \qquad (5.8) $$

where Pij (ui ) is the probability of entering state qj given a control ui at current state qi .
A Markov Decision Process (MDP) is a stochastic process on the random variables of state
q, control u and reward function R. The control at each state is purely a function of the state
without explicit dependence on time and past controls. Since the process is stationary, the
expected probability (5.8) becomes a form of the following Bellman equation in the MDP [84]:
$$ V^*(q_i) \;=\; \max_{u_i} \sum_{j=1}^{N} \Big[\, R(q_i, u_i, q_j) + P_{ij}(q_j \mid q_i, u_i)\, V^*(q_j) \,\Big] \qquad (5.9) $$

where R(qi , ui , qj ) is a reward for the state transition from qi to qj using ui . We set the reward
function R as follows:

$$ R(q_i, u_i, q_j) \;=\; \begin{cases} \;1 & : q_i \text{ in goal} \\ \;0 & : q_i \text{ out of workspace} \\ \;-0.001 - \alpha_1\,|\phi_j - \phi_i| - \alpha_2\,|\phi_i| & : \text{otherwise} \end{cases} \qquad (5.10) $$

where α1 and α2 are user-defined constants. The reward function is designed to return a higher
value for smaller changes in the roll angle and for motions closer to level flight (φ = 0). The
optimal policy π∗ at the state qi is given as
$$ \pi^*(q_i) \;=\; \arg\max_{u_i}\, \sum_{j=1}^{N} \Big[\, R(q_i, u_i, q_j) + P_{ij}(q_j \mid q_i, u_i)\, V^*(q_j) \,\Big] \qquad (5.11) $$

We solve this dynamic programming problem defined by the Bellman equation (5.9) by the
value iteration, which iteratively updates Ps for each state by evaluating (5.8). This generates
a DP look-up table containing the optimal control ui and the probability of success Ps for i =
1, . . . , N. To improve iteration speed, the sparsity of the matrices Pij(u) ∈ R^{N×N} is exploited,
since each row of Pij has only k nonzero entries, where k ≪ N due to the spatial vicinity of a
state transition. Hence, accessing only the nonzero entries of Pij(u) during computation requires
O(kN) rather than O(N²) time and memory at each iteration [85]. The value iteration terminates
when the maximum value change over all states is less than a threshold (we use 10⁻⁴).
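A compact sketch of the sparse value iteration; here the transition reward is folded into a per-(state, control) expected reward, and the data layout (lists of (successor, probability) pairs) is our own choice for exploiting the sparsity of Pij(u):

    import numpy as np

    def value_iteration(transitions, rewards, n_states, n_controls, tol=1e-4, max_iter=1000):
        # transitions[u][i]: list of (j, p) pairs, the k nonzero entries of row i of P(u)
        # rewards[u][i]:     expected reward of applying control u at state i
        V = np.zeros(n_states)
        policy = np.zeros(n_states, dtype=int)
        for _ in range(max_iter):
            V_new = np.full(n_states, -np.inf)
            for u in range(n_controls):
                for i in range(n_states):
                    q = rewards[u][i] + sum(p * V[j] for j, p in transitions[u][i])
                    if q > V_new[i]:
                        V_new[i], policy[i] = q, u
            done = np.max(np.abs(V_new - V)) < tol   # stop when the largest change is small
            V = V_new
            if done:
                break
        return V, policy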

Figure 5.11 shows several examples of an MDP optimal path using the lateral motions in
Figure 5.8 and the probabilistic model in Figure 5.10. Each plot shows three paths that start at
the same position and heading angle but different roll angles (0, 30, -30 degrees, respectively).

Figure 5.11: MDP-based optimal motion plans in an obstacle-free 2D space: each plot shows MDP paths
starting at three different roll angles (φ0 = 0◦ , φ1 = 30◦ and φ2 = −30◦ ). The color of a path segment in
the plan represents the magnitude of a final roll angle of the corresponding lateral motion primitive (the
red is |φk+1 | = 30◦ , green is |φk+1 | = 20◦ , blue is |φk+1 | = 10◦ , and black is |φk+1 | = 0◦ ).

[Figure 5.12 path lengths: shortest 122.4 m vs. MDP 134.8 m in one example, and shortest 208.2 m vs. MDP 222.5 m in the other.]

Figure 5.12: Comparison between the stochastic MDP optimal path maximizing the probability of success
Ps and the deterministic shortest path ignoring motion uncertainty. The MDP path is slightly longer than
the shortest path but more probable to enter the gate.

Due to the vehicle dynamics, distinctive routes to the goal are generated depending on the starting
roll angle.
We set a goal region G to a set of states satisfying −10 ≤ x ≤ 0m, |y| ≤ 3m, |ψ| ≤ 8◦ , and
|φ| ≤ 10◦ for a gate located at the workspace center and oriented toward the positive x-axis. The
number of discrete state transitions for each motion primitive is set to Np = 3 such that three
control inputs u0 , u1 , and u2 are sampled at 0, −1.0σφ and 1.0 σφ , respectively when κ = 0.1 in
(5.7). Their probabilities are computed by integrating the corresponding area under the normal
curve. We set the workspace by xmax = ymax = 100m and used discretization parameters ∆ = 2m
and Nψ = 120. The total number of resulting discrete states is N = 2, 100, 000. The DP problem
was implemented in C++ and tested on a 2.17GHz Intel PC. The computation time for each
value iteration was 55.3 sec and 50 iterations were needed until the maximum value change was
less than 10⁻⁴. Overall, the total time to compute the DP look-up table was 46.1 min.
Figure 5.12 compares an optimal path maximizing the probability of success Ps with the short-
est path ignoring motion uncertainty. A deterministic shortest path was computed by replacing
the reward function R in (5.10) with the motion length and using a single state transition at
the mean of the normal distribution. The MDP path is slightly longer than the corresponding
shortest path but is more probable to enter the gate because less aggressive turns are involved
near the gate.

5.5 Target visibility constraint

Suppose that a camera in a fixed-wing UAV has a limited field-of-view lens and is aligned down-
ward to detect a target on the ground. The camera’s viewing area for a surrounding environment
is bounded and subject to the airplane’s pose. If a motion plan consists of tight turns in high
roll angles near the target, the target is less likely to be exposed to the camera before the vehicle
passes through the target. It is not possible to achieve seamless target observation unless special-
ized units such as a pan-and-tilt gimbal system or omnidirectional lens are additionally attached
to the camera.
If this limited target visibility is ignored in motion planning when the confidence on a target
state is low, the vehicle is more likely to miss a target because it does not have a sufficient
time to observe the target and respond to a refined target pose produced from additional target
observation.

5.5.1 Minimum target observation time

To guarantee a continuous target-in-view approach with an increased confidence on the target,


we define a visibility goal region Gv that satisfies tv,min ≤ tv ≤ tv,max for a non-discontinuous


Figure 5.13: MDP motion plans π v to the visibility goal region Gv that is used as an intermediate goal
to ensure a sufficient target observation time tv . Every state in Gv satisfies 2.5 ≤ tv (π g (qi , G)) ≤ 3.2 sec
for the original goal region G when Va = 10.5 m/sec and G is located around 40m ahead of the target gate.

Figure 5.14: Comparison between MDP motion plans when the target visibility constraint is ignored
(πg) or considered (πvg). The red segment on each path indicates the states at which the target is visible,
and the number is the target observation time tv. The plan πvg that arrives at the goal by way of Gv is longer
than πg but guarantees the minimum tv. The target visibility is computed when the vehicle's altitude is
18m and the camera's field of view and downward angle are 70 and 40 degrees, respectively.

Given a look-up table of the MDP plan πg in Section 5.4.3, which is computed for an original
goal region G around the target object g, we can find the following visibility goal region Gv by
investigating πg for every state:

Gv = {qi | tv,min ≤ tv (π g (qi , G)) ≤ tv,max , V (qk , g) = true for all qk in π g (qi , G)} (5.12)

where V (qk , g) denotes the visibility of the target g at the vehicle state qk . The path π g starting
at any state in Gv always sees the target for the time longer than tv,min .
Providing Gv as a set of intermediate goals, we design a two-step motion planning scheme in
order to impose a target observation time tv on the resulting motion plan. Firstly, a new MDP
optimal plan πv is computed using the same MDP framework but with Gv as the goal region. Then,
the two individual paths (πv and πg) are combined into a final motion plan πvg via an intermediate
goal qm in Gv. In other words, πvg(qs, G) = πv(qs, qm ∈ Gv) + πg(qm, G), where qs is a starting
state and qm is the final state of πv.
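As a rough sketch of how Gv can be harvested from the lookup table of πg, the C++ fragment below accumulates the target-visible time along each rolled-out path and tests (5.12). The per-state rollouts, step durations, and visibility flags are assumed to be extracted beforehand from the MDP table; they are illustrative placeholders rather than the actual implementation.

```cpp
#include <vector>
#include <cstddef>

// rollout[q] lists the discrete states visited by following pi_g from state q to the goal G,
// dt[q][k] is the duration of the k-th primitive along that path, and visible[s] says whether
// the target is inside the camera view at state s (all assumed precomputed).
std::vector<int> buildVisibilityGoalRegion(const std::vector<std::vector<int>>& rollout,
                                           const std::vector<std::vector<double>>& dt,
                                           const std::vector<bool>& visible,
                                           double tvMin, double tvMax)
{
    std::vector<int> Gv;
    for (std::size_t q = 0; q < rollout.size(); ++q) {
        double tv = 0.0;
        bool alwaysVisible = true;
        for (std::size_t k = 0; k < rollout[q].size(); ++k) {
            if (!visible[rollout[q][k]]) { alwaysVisible = false; break; }
            tv += dt[q][k];
        }
        if (alwaysVisible && tv >= tvMin && tv <= tvMax)
            Gv.push_back(static_cast<int>(q));   // state q satisfies (5.12)
    }
    return Gv;
}
```

The final plan πvg is then assembled by concatenating the πv entry for the start state with the πg entry for the intermediate state reached in Gv.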
Figure 5.13 shows the visibility goal region Gv which guarantees the minimum target observa-
tion time tv,min = 2.5 sec at the cruise speed Va = 10.5 m/sec, and is located around 40m ahead
of the goal. In Figure 5.14, we compare the path produced when the visibility constraint is either
ignored (π g ) or considered (π vg ). The path length of π vg is typically longer than that of π g in
order to provide a minimum target observation time in π vg .

5.6 Motion Plans in 3D

The MDP framework introduced in the previous sections can be extended to 3D by taking lon-
gitudinal motions into account. The state dimension increases by one since the state becomes
q = (x, y, z, φ, ψ) in 3D. In addition to the lateral motions in Figure 5.8, we use the longitudinal
motion primitives presented in Figure 5.3 whose time duration (tf ) of a pitch reference profile is
fixed to 2.4 sec for altitude change.
Figure 5.15 shows two examples of 3D optimal motion plans generated by the MDP-based
motion planning. The state space is discretized by ∆x,y = 2m, ∆z = 3m, ∆φ = 10◦ and ∆ψ = 5◦
in a relatively small 3D workspace of 60 × 60 × 24 meters. The number of discrete states is
N = 18, 144, 000 and the total computation time for a DP lookup table was about 12 hours.
Note that the MDP demands significant computation time and memory space when the
number of states is large. Thus, in previous sections, we limited the MDP-based planning to
lateral maneuvers on a horizontal plane and the altitude was controlled separately. Nonetheless, a
deliberate design of state-space discretization (e.g., multi-resolution approach) or an appropriate
choice of a workspace range will help reduce the number of states and accelerate the MDP
computation for a larger 3D workspace size.


Figure 5.15: MDP-based optimal motion plans in a 3D space: Contrary to Figure 5.11, the gates are
placed at different altitudes from the vehicle’s current altitude (∆Z = 10m in (a) and ∆Z = −15m in
(b)). The red segments on each motion plan correspond to longitudinal motion primitives which cause
altitude changes.

5.7 Entering a gate

When the vehicle is close to a target with an appropriate heading angle, completing the task
requires a fine-level control that can lead the vehicle to enter a gate successfully. Since the
resolution of the motion primitives in Section 5.2 is not high enough for control at the desired
precision level, the navigation is switched to another algorithm near the gate.
Trajectory tracking requires the vehicle to be in a particular location at a particular time
and thus may result in significant problems for small UAVs unless a tracking controller properly
accounts for disturbances. Hence, we instead use a straight path following control proposed by
Nelson et al. [86], which guides the vehicle to remain on the path without the notion of time and
is thus proven to be robust to wind disturbances.

5.7.1 Vector field along a straight line

Without loss of generality, suppose that a gate is located at the origin and directed along the
y-axis as shown in Figure 5.16. The goal is to follow the straight-line path aligned with the gate
orientation and to lead the vehicle to a successful gate passing. This path-following method is
based on the construction of a vector field surrounding the path to be followed. The vectors of
this field provide heading commands to guide the vehicle toward the desired path.
Let x be the lateral distance of the vehicle from the path and let ψd be the desired heading
of the vehicle. The objective is to construct the vector field so that the vehicle is directed
perpendicularly toward the path (|ψd| approaching π/2) when |x| is large, and the vehicle is directed along the


Figure 5.16: Entering a gate: vector field for a desired heading angle ψd is generated along the straight
line aligned to a gate when the vehicle is sufficiently close to the gate with a proper heading angle (k = 0.02
in ψd = − tan−1 (kx)).

path (ψd = 0) as x approaches zero as follows:

ψd = − tan−1 (kx) (5.13)


φd = −Kp (ψd − ψ) (5.14)

where k is a positive constant that governs the rate of the transition from π/2 to zero. A small
value of k produces a long and smooth transition in the desired heading. In our autopilot, a
heading-hold loop for ψd is implemented by controlling the roll angle to follow φd, which is
proportional to the heading error as in (5.14).
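The guidance law (5.13)-(5.14) itself is only a few lines. A minimal C++ sketch, assuming the sign conventions of the text and purely illustrative default gains, is:

```cpp
#include <cmath>

// Straight-line vector-field guidance: psi_d from the lateral offset x (eq. 5.13),
// phi_d from the heading error (eq. 5.14). Gains k and Kp are illustrative values.
struct HeadingRollCommand { double psi_d; double phi_d; };

HeadingRollCommand straightLineGuidance(double x, double psi,
                                        double k = 0.02, double Kp = 1.0)
{
    HeadingRollCommand cmd;
    cmd.psi_d = -std::atan(k * x);            // desired heading, eq. (5.13)
    cmd.phi_d = -Kp * (cmd.psi_d - psi);      // desired roll from heading error, eq. (5.14)
    return cmd;
}
```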
Chapter 6

Air-Slalom Experiment

In this chapter, we will describe the UAV system that has been developed for simulation and
experiments, along with additional algorithms that are needed for autonomous navigation, such
as target detection and vehicle and target state estimation. Experiment results on the air-slalom
task will be presented when a single gate or multiple gates are arranged on the ground.

6.1 UAV system

Our UAV system development is targeted at a small, low-cost platform that is deployable
and expendable with a high budget-to-performance ratio. All of the components employed in the
system are commodity-level products and readily available on the market. It is expected that our
system architecture and hardware would be easily transferable to other airplane platforms.
The UAV system consists of the following two main parts linked by wireless serial communica-
tion: (1) an airplane system which is a fixed-wing model airplane equipped with onboard sensors
and processors for flight control and vision processing, and (2) a ground station which not only
monitors flight status but also provides sufficient computation capabilities for state estimation
and motion planning.

6.1.1 Hardware and Software Architecture

Figure 6.1 depicts the architecture of the UAV system hardware and communication system. The
fixed-wing airplane is based on a conventional RC model that has a 1.2m wingspan and three control
inputs (thrust, rudder and elevator), and is typically sold to hobbyists at the beginner level.
The maximum payload is 250g, which is almost the same as the total weight of all components
including onboard sensors, autopilot, onboard computer and batteries. The maximum flight time
is around 10 minutes due to the limited battery capacity and high payload.


Table 6.1: Specifications of low-cost commercial off-the-shelf components in the UAV system

Component Manufacturer & Model Specifications


Airplane Hobbyzone SuperCub 3-channel (thrust, rudder, elevator), 1.2m wingspan
Battery #1 Cellpro CP0950-3S Lithium polymer, 3-cell, 11.1V, 900mAh
Battery #2 Cellpro CP0300-2S Lithium polymer, 2-cell, 7.4V, 300mAh
Barometer BMP085 0.03 hPa accuracy, MEMS
IMU Analog Devices ±2g, ±200◦ /sec, tri-axial, MEMS
GPS U-Blox LEA-5H 4Hz position update, helix antenna
Camera FireFly MV captured in 320 × 240 at 15FPS, 70◦ FOV lens
Autopilot Ardupilot Mega 1280 16MHz, 8 11-bit A/D, 4 serial, 3 PWMs
Onboard computer Gumstix Overo Fire OMAP3530, 720MHz, 512M mem, 8GB flash, Summit
Communication XBee PRO 900 Serial comm at 38.4kbps, 900 MHz
Ground station Alienware M15X 1.73GHz, Quad core

[Figure 6.1 block diagram: on the airplane side, the camera (15 FPS, USB) feeds the onboard computer for image acquisition and processing, while the autopilot reads the IMU and barometer at 50 Hz and the GPS at 4 Hz, drives the servos at 50 Hz, and is switched between AUTO and MANUAL through the RC receiver; the simulated world replaces the airplane with sensor emulation, an airplane dynamic model, and a rendering engine over a graphical 3D urban model; either side is connected to the ground station (state estimation and motion planning) over a wireless serial link.]
Figure 6.1: The architecture of UAV hardware and software systems. Wireless communication to the
ground station can be switched between the real airplane and the simulator.

Figure 6.2: UAV testbed platform and onboard components

The autopilot combines GPS, strap-down MEMS IMU, and barometer sensors into PID feedback flight controls. The
detailed specifications of the airplane platform and onboard sensors are listed in Table 6.1.
The execution of navigation algorithms is distributed over on-body and off-body computation.
Since our wireless communication throughput is not sufficient for realtime image delivery, vision
processing runs on the onboard computer and its core output is then sent to the ground station
in order to be combined with other sensor data for vehicle and target state estimation.
There exist two main switches in the system architecture. One is for the data link connected to
the ground station and switched between the real world and the simulated world. The navigation
algorithms at the ground station are completely isolated from where the switch is connected.
The other is for the airplane control which selects an autonomous mode or manual mode for
emergency interception.

6.1.2 UAV Simulator

A graphical UAV simulation environment has been developed in order to evaluate the performance
in many different operational scenarios with ground truth. The dynamic model in the simulator
was developed at the MAGICC lab at BYU [81] and represents a fixed-wing that has a 1-5 foot
wing span and is powered by an electric motor attached to a fixed-pitch propeller. Some model
parameters were hand-tuned for our airplane after flight behaviors in outdoor flight were compared
with the simulated model. Several 3D graphic models of urban scenery are integrated into the
simulation environment for camera emulation. Irrlicht Engine is used for realtime rendering and
easy loading of commercially available graphic file formats.
In the simulator, all sensor emulations and vehicle dynamics updates are based on the simulated
clock. Since the simulator runs on a computer completely separate from the ground station
in Figure 6.1, the clock of the simulated world runs consistently so that periodic sensor emulation
takes place. Otherwise, unpredictable time delays due to additional heavy computation may
occur, which would corrupt the timing of the simulated sensors.


Figure 6.3: The experimental setup for the air slalom task using virtual gates in the air, each of which
is conceivable from a set of colored square objects on the ground.

6.2 Virtual gate and target detection

Our experimental setup for the air slalom task is illustrated in Figure 6.3. Since it is difficult to
install real gates in the air at a fixed altitude, we instead construct virtual gates
that can easily be inferred from real ground targets. In the experiment, two virtual gates are
implemented by placing visual targets on the ground.

Virtual gate:

Each virtual gate in Figure 6.3 is constructed using four square-shaped objects placed on
the ground. These ground objects are identical in size and consist of one blue square and three
red squares. The four squares are arranged in a rectangle, and the distance between adjacent
square centers is set to 6m (w = h = 6m). The position of the virtual gate is equal to the rectangle
center, and its zero orientation corresponds to the case in which the blue square is at the bottom-left corner.

Gate detection and association:

The ground target objects are distinctly colored so that simple color matching can easily detect
them in images. Firstly, we prescribe color regions for the red and blue targets, respectively, in
the HSV color space. The thresholding on each channel can produce a binary image for matched
color areas. Secondly, a list of connected regions is extracted from this binary image by an optimal two-pass connected-component labeling algorithm (Wu et al. [87]), which is known as one of the fastest labeling methods.

Algorithm 6.1: Gate detection and association

{pk1, pk2, pk3, pk4} = Gate_Detection_And_Association(I, xcam)
  Set bounded color regions R_HSV for the red and blue targets and a minimum blob area A_min
  I_HSV   = convert_color_space(I_RGB)
  I_BIN   = thresholding(I_HSV, R_HSV)
  I_LABEL = two_pass_connected_component_labeling(I_BIN)
  (pb1, ..., pbp, pr1, ..., prq) = mass_center(I_LABEL, A_min)
  foreach blue point pbi do
    foreach gate state xgk do
      foreach triple of red points (pr1, pr2, pr3) do
        rj = (pj1, pj2, pj3, pj4) = set_anticlockwise_order(pbi, pr1, pr2, pr3)
        x̂cam = camera_pose_homography(rj, xgk, w, h)
        err(k, j) = ||xcamk − x̂cam||
  (k*, j*) = argmin_{k,j} err(k, j)

(a) Captured image (b) Thresholding in HSV space (c) Connected blobs (d) Rectangular gate shape

Figure 6.4: A single or multiple-gate detection by color matching and blobing

Finally, for the blobs larger than a minimum area Amin, the mass centers
of blue or red blobs (p1b , . . . , ppb , p1r , . . . , pqr ) become candidates for rectangular corner points of
each ground gate.
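As a sketch of these first two steps, the fragment below uses OpenCV's cvtColor/inRange and connectedComponentsWithStats as a stand-in for the two-pass labeling of Wu et al. [87]; the HSV bounds and minimum area are illustrative assumptions, not the values used in the experiments.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

// Color-blob extraction: HSV thresholding followed by connected-component labeling.
// Returns the mass centers of blobs whose area exceeds minArea.
std::vector<cv::Point2f> detectColorBlobs(const cv::Mat& bgr,
                                          const cv::Scalar& hsvLow,
                                          const cv::Scalar& hsvHigh,
                                          int minArea)
{
    cv::Mat hsv, mask, labels, stats, centroids;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, hsvLow, hsvHigh, mask);               // binary image of matched color
    int n = cv::connectedComponentsWithStats(mask, labels, stats, centroids);

    std::vector<cv::Point2f> centers;                      // candidate rectangle corner points
    for (int i = 1; i < n; ++i) {                          // label 0 is the background
        if (stats.at<int>(i, cv::CC_STAT_AREA) >= minArea)
            centers.emplace_back(static_cast<float>(centroids.at<double>(i, 0)),
                                 static_cast<float>(centroids.at<double>(i, 1)));
    }
    return centers;
}
```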
Suppose that n gates are already initialized and their current pose estimates are (x1g , . . . , xng ).
To correctly associate each mass center with a corresponding rectangular corner in the presence
of outliers, we test all possible rectangles that can be drawn by {pib } and {pir }. Each rectangle
candidate consists of one blue and three red points in an anti-clockwise order on an image. From
a known rectangular shape (w, h) and xkg , the hypothesis error for the k-th gate is the difference

Figure 6.5: The prediction and update steps of the Unscented Kalman filter for vehicle and
target state estimation based on GPS, IMU, barometer and camera.

between a current camera pose xkcam and the pose x̂cam estimated by the planar homography,
which is computed by 2D-to-2D point correspondences between the four corners on the image
plane and those on the ground plane [53]. For each blue point pib , a minimum error reveals which
gate is associated with the best hypothesis (pk1 , pk2 , pk3 , pk4 ) where pk1 = pib . Next, this result
proceeds to the measurement update in the target state estimation described in the following
section. See Algorithm 6.1 and Figure 6.4 for the more detailed procedure.

6.3 Vehicle and target state estimation

The estimation of vehicle and target states for the air-slalom task is based on the sensor fusion
of low-grade and lightweight onboard sensors such as GPS, IMU, barometer and camera. We
use an Unscented Kalman filter (UKF) [88] to combine a predicted vehicle motion and the sensor
measurements in nonlinear equations. For a Gaussian state distribution, the posterior estimate
produced by the UKF is accurate to the third order. One main benefit of the UKF is that it is
derivative-free and thus the filter does not require Jacobian matrices to be computed at each step.
Although the computational load of the UKF increases compared with other Kalman filters, it
remains manageable on a general desktop computer.
The state estimate x in the UKF consists of nine states for the vehicle and three states for

each target gate, i.e., x = (xv , xg ) = (x, y, z, φ, θ, ψ, v, α, β, x1g , yg1 , ψg1 , . . . , xng , ygn , ψgn ). The
vehicle state is denoted by the position pv = (x, y, z) and orientation ov = (φ, θ, ψ), vehicle
speed v, angle of attack α and side slip angle β. The target state xig for the i-th gate is denoted
by the position (xig, ygi) and orientation ψgi on the ground plane. The state space dimension is
(9 + 3n) when n is the number of target gates. The UKF in Figure 6.5 alternates the following
two distinct phases, i.e., vehicle motion prediction and measurement update:

Vehicle motion prediction:


The prediction phase uses the state estimate from the previous timestep to produce an esti-
mate of the state at the current timestep. The following continuous-time prediction equations in
the UKF represent a non-accelerating and non-acrobatic maneuver of a fixed-wing airplane. Two
angles (α and β) account for the discrepancy between a true vehicle moving direction and the
fuselage orientation which are caused by lift force and wind disturbance. A strap-down gyroscope
output ω g is the only external input in the prediction step. All the ground gates are assumed to
be stationary.

$$\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{z} \end{bmatrix} = R(\phi,\ \theta+\alpha,\ \psi+\beta) \begin{bmatrix} v \\ 0 \\ 0 \end{bmatrix} + w_p \qquad (6.1)$$

$$\begin{bmatrix} \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{bmatrix} = \Omega(\phi, \theta)\, \omega_g + w_o \qquad (6.2)$$

$$\begin{bmatrix} \dot{v} \\ \dot{\alpha} \\ \dot{\beta} \end{bmatrix} = 0 + w_v \qquad (6.3)$$

$$\begin{bmatrix} \dot{x}_g^i \\ \dot{y}_g^i \\ \dot{\psi}_g^i \end{bmatrix} = 0 + w_g, \quad \text{for } i = 1, \dots, n \qquad (6.4)$$

where

$$R(\phi,\theta,\psi) = \begin{bmatrix} c_\theta c_\psi & -c_\phi s_\psi + s_\phi s_\theta c_\psi & s_\phi s_\psi + c_\phi s_\theta c_\psi \\ c_\theta s_\psi & c_\phi c_\psi + s_\phi s_\theta s_\psi & -s_\phi c_\psi + c_\phi s_\theta s_\psi \\ -s_\theta & s_\phi c_\theta & c_\phi c_\theta \end{bmatrix}, \quad \Omega(\phi,\theta) = \begin{bmatrix} 1 & s_\phi t_\theta & c_\phi t_\theta \\ 0 & c_\phi & -s_\phi \\ 0 & s_\phi/c_\theta & c_\phi/c_\theta \end{bmatrix} \qquad (6.5)$$

and $w_p$, $w_o$, $w_v$ and $w_g$ are Gaussian process noises.
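A minimal Euler-integration sketch of this prediction model, with the process noise omitted and the notation of (6.1)-(6.5), could look as follows; this is only an illustration, not the UKF implementation itself, which propagates sigma points through the same dynamics function.

```cpp
#include <cmath>

// Euler-integration sketch of the prediction model (6.1)-(6.2): the position is propagated
// along the body x-axis rotated by (phi, theta+alpha, psi+beta), and the Euler angles are
// propagated from the gyroscope rates through Omega(phi, theta). Process noise is omitted.
struct VehicleState { double x, y, z, phi, theta, psi, v, alpha, beta; };

void predict(VehicleState& s, const double w[3], double dt)   // w = gyro rates (p, q, r)
{
    const double cphi = std::cos(s.phi),  sphi = std::sin(s.phi);
    const double th   = s.theta + s.alpha;
    const double cth  = std::cos(th),     sth  = std::sin(th);
    const double ps   = s.psi + s.beta;
    const double cps  = std::cos(ps),     sps  = std::sin(ps);

    // First column of R(phi, theta+alpha, psi+beta) times [v, 0, 0]^T, eq. (6.1)
    s.x += s.v * cth * cps * dt;
    s.y += s.v * cth * sps * dt;
    s.z += s.v * (-sth) * dt;

    // Euler-angle kinematics Omega(phi, theta) * omega, eq. (6.2)
    const double tth = std::tan(s.theta), cthe = std::cos(s.theta);
    s.phi   += (w[0] + sphi * tth * w[1] + cphi * tth * w[2]) * dt;
    s.theta += (cphi * w[1] - sphi * w[2]) * dt;
    s.psi   += (sphi / cthe * w[1] + cphi / cthe * w[2]) * dt;
    // v, alpha, beta and the gate states are modeled as constant (eqs. 6.3 and 6.4).
}
```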



Measurement update:
In the update phase, the current a priori prediction of the state x is combined with the current
measurements of the onboard sensors listed in Table 6.1 in order to refine x to the a posteriori
state estimate. The following update equations are used to compute the predicted measurement
from the predicted state x̂ in the UKF.
• GPS (position, speed, heading):
An up-to-date GPS receiver module provides many features of navigation states from satel-
lites, which commonly includes position, velocity, precision timing and level of data accu-
racy. Among available GPS data, we select the measurements for a 2D position, ground
speed Vground, and ground course from GPS velocity Vgps.

$$h_{gps}(x, y, \psi, \beta, v) = \begin{bmatrix} x \\ y \\ \psi + \beta \\ v \end{bmatrix} = \begin{bmatrix} x_{gps} + x_0 \\ y_{gps} + y_0 \\ \mathrm{atan2}(V_{y,gps},\ V_{x,gps}) \\ V_{ground} \end{bmatrix} \qquad (6.6)$$

Since the U-Blox GPS module performs a different satellite signal processing depending
on an internal operating mode, it is important to set an appropriate mode for aerial 3D
motions. The update rate of the GPS is 4Hz.

• Barometer (altitude):
The digital pressure sensor reports the pressure and temperature in steps of 0.01hPa and
0.1◦ C. Once a true pressure p is computed after temperature compensation, the following
barometer formula describes how the absolute pressure of the air changes with altitude z.

$$h_{baro}(z) = p = p_s \left( 1 - \frac{z + z_0}{44330} \right)^{5.255} \qquad (6.7)$$

where ps is the pressure at sea level (e.g., 1013.25hPa) and the altitude change ∆z = 1m
corresponds to ∆p = 0.12hPa at p = ps . The update rate of the barometer is 50Hz.
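Inverting (6.7) gives the altitude implied by a pressure reading, which is convenient for initialization and sanity checks. A small sketch under the same notation (default values are the ones quoted above):

```cpp
#include <cmath>

// Inverse of the barometric formula (6.7): altitude from a temperature-compensated pressure p,
// with ps the sea-level pressure (e.g., 1013.25 hPa) and z0 the reference offset.
double pressureToAltitude(double p, double ps = 1013.25, double z0 = 0.0)
{
    return 44330.0 * (1.0 - std::pow(p / ps, 1.0 / 5.255)) - z0;
}
```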

• Accelerometer (attitude):
An accelerometer measurement am is the sum of gravity ag and aerodynamic forces adyn
like lift, thrust and drag. Since it is almost impossible to eliminate adyn completely from
am , we estimate ag by removing the centrifugal acceleration acen , which is a main factor in
adyn during banked turns.

$$a_g = a_m - a_{dyn} \qquad (6.8)$$
$$\hat{a}_g = a_m - a_{cen} \approx a_m - \omega_{gyro} \times [\, v,\ 0,\ 0\, ]^\top \qquad (6.9)$$

An estimated gravity $\hat{a}_g = (a_x, a_y, a_z)$ provides the vehicle attitude as follows:

$$h_{acc}(\phi, \theta) = \begin{bmatrix} \phi \\ \theta \end{bmatrix} = \begin{bmatrix} \mathrm{atan2}(a_y,\ a_z) \\ \mathrm{atan2}\!\left(-a_x,\ \sqrt{a_y^2 + a_z^2}\right) \end{bmatrix} \qquad (6.10)$$

Similar approaches to cancel adyn out can be found in the nonlinear complementary filter [4]
and direction cosine matrix [5]. The update rate of the IMU is 50Hz.
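A compact sketch of (6.9)-(6.10), assuming the body-axis conventions of the equations above and gyro rates ordered as (p, q, r):

```cpp
#include <cmath>

// Attitude measurement from the accelerometer: subtract the centrifugal term
// omega x [v, 0, 0]^T = (0, r*v, -q*v) from the raw reading (eq. 6.9), then convert the
// estimated gravity direction to roll and pitch (eq. 6.10).
void attitudeFromAccelerometer(const double am[3],    // raw accelerometer (ax, ay, az)
                               const double gyro[3],  // angular rates (p, q, r)
                               double v,              // speed along the body x-axis
                               double& phi, double& theta)
{
    const double ag[3] = { am[0],
                           am[1] - gyro[2] * v,
                           am[2] + gyro[1] * v };
    phi   = std::atan2(ag[1], ag[2]);
    theta = std::atan2(-ag[0], std::sqrt(ag[1] * ag[1] + ag[2] * ag[2]));
}
```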

• Camera (line segments):
The update equation (3.13) for a gravity-related line segment measurement ℓi is already
presented in the visual/inertial attitude estimation, i.e.,

$$h_{line}(\phi, \theta) = \tan^{-1}\!\left( \frac{\hat{v}_y^* - m_y}{\hat{v}_x^* - m_x} \right) \qquad (6.11)$$

where m is a line midpoint and $\hat{v}^*$ is a predicted vanishing point of ℓi. The prerequisite
line classification for vertical and horizontal edges is described in Section 3.4.

• Camera (target points):
As shown in Figure 6.4(d), the measurement of the i-th target gate is composed of the four
corners (pi1, pi2, pi3, pi4) of a rectangle in normalized camera coordinates. The blue square
corresponds to pi1 and the other red squares are listed in an anti-clockwise direction. The
following update equation is the perspective projection of these rectangular corners in Pr,
which are located on the x-y world plane.

$$h_{target}(p_v, o_v, x_g^i) = \begin{bmatrix} p_1^i \\ p_2^i \\ p_3^i \\ p_4^i \end{bmatrix} \equiv [\, R_c^\top,\ -R_c^\top T_c\, ] \begin{bmatrix} P_g^1 \\ P_g^2 \\ P_g^3 \\ P_g^4 \end{bmatrix} \qquad (6.12)$$

$$P_r = \frac{1}{2} \begin{bmatrix} -w & -w & w & w \\ h & -h & -h & h \\ 0 & 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad P_g^k = R(\psi_g^i)\, P_r^k + \begin{bmatrix} x_g^i \\ y_g^i \\ 0 \end{bmatrix} \ \text{for } k = 1, \dots, 4 \qquad (6.13)$$

where pv = (x, y, z), ov = (φ, θ, ψ), xig = (xig, ygi, ψgi) and ≡ denotes equality in
homogeneous coordinates. Pkr is the k-th column of Pr. Currently the lightweight onboard
microprocessor limits the update rate of the camera measurement to 7.5Hz.

The orientation drift that occurs in the gyroscope integration (6.2) at the prediction phase is
corrected by the GPS ground course, static gravity and line segments in the update phase. For
the altitude estimation, we use the barometer rather than the GPS vertical position in (6.6) and

(6.7). This is because the GPS vertical accuracy is relatively poorer than the GPS horizontal
accuracy and is vulnerable to systematic errors caused by unfavorable satellite locations. While
the pressure altitude from the barometer is noisier than the GPS altitude, the barometer
shows better precision and repeatability than the GPS in measuring the altitude.

6.4 Experimental results

We designed two different air-slalom task scenarios for the demonstration. The first is a single-gate
case in Figure 6.6 where only one gate is placed in the workspace but a desired gate orientation
is repeatedly rotated 90 degrees for the next entrance when the gate entrance is completed. The
second is a multiple-gate case in Figure 6.11 where two gates are placed and their orientations
are unchanged and fixed by the target color.
In both experimental tasks, we used separate motion planning schemes for lateral and longitudinal
maneuvers, respectively. For lateral maneuvers moving toward a gate, the MDP planner in
Section 5.4 was used, taking the target visibility constraint into account. When the vehicle
entered the visibility goal region, the straight-line following in Section 5.7 began to guide the vehicle
to the gate. For longitudinal maneuvers, the altitude zg for each virtual gate was not strictly
specified, but the flight altitude was maintained between upper and lower bounds to prevent a
crash. At the end of each lateral motion primitive, the vehicle altitude was checked against its desired
bound. In the case where the altitude was out of this bound, an appropriate longitudinal motion
primitive was immediately executed to push the vehicle back to a safe altitude.
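This altitude guard reduces to a simple band check executed between primitives; a minimal sketch with hypothetical names:

```cpp
// Altitude guard run at the end of every lateral motion primitive: if the current altitude
// leaves the allowed band, a climbing or descending longitudinal primitive is requested
// before the next lateral primitive. Names are illustrative placeholders.
enum class Primitive { None, ClimbStep, DescendStep };

Primitive altitudeGuard(double z, double zLower, double zUpper)
{
    if (z < zLower) return Primitive::ClimbStep;     // push the vehicle back up
    if (z > zUpper) return Primitive::DescendStep;   // bleed off excess altitude
    return Primitive::None;                          // continue with the lateral plan
}
```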
The experiment site at which we performed the slalom task was an obstacle-free and flat en-
vironment. Since no architectural structures were present in the workspace, gravity-related line
segments were not available for attitude estimation. Hence, the attitude drift in the UKF was
mainly reset by an approximate static gravity in (6.10) from the accelerometer after centrifugal
accelerations were eliminated. Although this fact may increase the uncertainty of motion prim-
itives due to the degraded accuracy of attitude estimation, the experiments demonstrate that
MDP-based motion plans effectively cope with motion uncertainty.

Gate search and initialization:

The UAV first started to navigate around the workspace in a predefined pattern to search
and identify all the gates. The first reliable reading of the GPS and barometer set the initial
vehicle position (x0 , y0 , z0 ) in (6.6) and (6.7) of the UKF. Once four square-shaped ground targets
(pk0 , pk1 , pk2 , pk3 ) were detected for the k-th gate, the gate state xkg was initialized by four points in
the world coordinate, each of which is an intersection point of the ground plane and the camera
ray computed from a current vehicle state xv and target measurement pki . The camera frame

rate was 7.5 Hz. Figure 6.6 displays the trajectories of the target states estimated during the gate
search until the targets disappeared from the image. The estimation result shows that 10 to 15
target measurements were sufficient to stabilize the target state (gate posture) in the filter.
The initial position deviation was less than 3m, which is relatively small because the targets were
detected during level flight, where the vehicle attitude would be the most accurate. If the targets
were detected during a banked turn, the initial deviation could increase due to a larger attitude
error in the filter.
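The initialization step amounts to intersecting each target's camera ray with the ground plane. A sketch under the assumption of normalized image coordinates and a camera-to-world rotation taken from the current vehicle state estimate:

```cpp
#include <cmath>

// Back-project a normalized image measurement (u, v) to the ground plane z = 0.
// Rwc is the camera-to-world rotation and camPos the camera position in the world frame,
// both assumed to come from the current state estimate.
struct Vec3 { double x, y, z; };

bool intersectGroundPlane(const double Rwc[3][3], const Vec3& camPos,
                          double u, double v, Vec3& groundPoint)
{
    // Ray direction in the world frame: Rwc * (u, v, 1)^T
    const Vec3 d = { Rwc[0][0]*u + Rwc[0][1]*v + Rwc[0][2],
                     Rwc[1][0]*u + Rwc[1][1]*v + Rwc[1][2],
                     Rwc[2][0]*u + Rwc[2][1]*v + Rwc[2][2] };
    if (std::fabs(d.z) < 1e-9) return false;         // ray parallel to the ground plane
    const double t = -camPos.z / d.z;                 // solve camPos.z + t * d.z = 0
    if (t <= 0.0) return false;                       // intersection behind the camera
    groundPoint = { camPos.x + t * d.x, camPos.y + t * d.y, 0.0 };
    return true;
}
```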

6.4.1 Single-gate air slalom

The task scenario for a single gate is that the UAV repeatedly passes through the same gate
in different orientations incremented by 90 degrees each time the gate entrance is completed.
Figure 6.7 shows the navigation trajectory and the gate posture estimated in the UKF during
the single-gate air slalom. The route shape generated by MDP motion plans produces a four-leaf
clover after five gate completions. The images at the top verify the gate orientations which differ
by 90 degrees at each entrance. The gate was composed of four identical red squares but one of
them (labeled as 1) was rotated 45 degrees with respect to the others.
The jittering of the vehicle position that occurs each time the UAV approaches the gate is
caused by a large innovation in the target measurement update in the UKF. When the camera
starts to see the ground targets again after turning back, the expected target measurements are
less likely to be close to the real ones due to attitude estimation error, which could be large except
during a level flight. This error is thus distributed over the vehicle position and heading in the
UKF until it is eliminated by a sufficient number of target observations.
Figure 6.8 shows a progressive change of the MDP optimal motion plan as the vehicle ap-
proaches the gate. At the end of the execution of each lateral motion primitive, a new optimal
plan to the gate was found from a precomputed MDP look-up table that guarantees a sufficient
target observation time tobs. The visibility goal region was set to 2.5 ≤ tobs ≤ 3.5 sec in Section 5.5. In both examples, the first trial to enter the gate failed due to a large deviation from
the planned path near the gate and a turning-back motion was then generated to provide a new
route. The second trial successfully drove the vehicle to the gate.
Figure 6.9 shows the performance of the straight-line following in Section 5.7 that enables
the vehicle to enter the gate with a desired heading. The vector field for ψd is determined by the
distance to the straight line in (5.13). In the simplified dynamics of a fixed-wing UAV, the control
of the vehicle heading ψ can be considered an integrator response of the roll angle φ. Hence,
the regulation of the desired heading angle using the simple controller φd = −Kp(ψd − ψ)
in (5.14) becomes slow and lags in phase.
Figure 6.10 shows the vehicle speed, altitude, roll and pitch angles regulated by the autopilot

Figure 6.6: Trajectory of the target state estimation from consecutive ground target measurements in
the UKF during the gate search. Two gates are detected and each target is observed for about 4 seconds
at a 7.5 Hz camera frame rate. (Middle) The gray and red squares for each gate indicate the initial and final target
poses, respectively.

Figure 6.7: Navigation result on the single-gate air slalom task: only one gate is placed in the workspace
but a desired gate orientation for the next entrance is repeatedly rotated 90 degrees whenever the gate
entrance is completed. The gate entrance is completed five times. The images at the top show the gate at
each entrance. All ground targets are red and their first square is rotated by 45 degrees.


Figure 6.8: MDP motion planning results in the single-gate air slalom task. At the end of execution of
each motion primitive, a new optimal plan to the gate is found from the precomputed MDP look-up table
and sent to the autopilot. In both results, the first trial to approach the gate fails due to insufficient target
observation time and the entrance is successful at the next approach. The gray area indicates the camera’s
field-of-view. The color of each motion primitive in the planned path represents its final roll angle (30
degrees in red, 20 in green, 10 in blue, and 0 in black).


Figure 6.9: Two gate entrance results performed by the straight-line following in the air slalom task.
In both results, the vector field for the desired heading angle ψd is generated by ψd = −tan−1(ky) in
(5.13) and the desired roll angle is given by φd = −Kp(ψd − ψ), where k = 3 and Kp = 1.0.


Figure 6.10: Autopilot control outputs to generate lateral and longitudinal motion primitives during
the single-gate air slalom task in Figure 6.7. The desired roll angle φd provided by the MDP planner is
switched between -30 to 30 degrees in a 10-degree resolution and most of the lateral motions are right
banked turns. The crooked φd in a high frequency indicates the straight line following mode for the gate
entrance. The peaks in the desired pitch angle θd occur when altitude recovery is required.

feedback controllers in (5.3) to (5.5). The desired cruise speed Va,d was constantly set to 10.5
m/sec. The desired angles, φd and θd, were provided by the lateral or longitudinal motion primitive
chosen in the MDP motion plans. For lateral maneuver control, the discrete desired roll φd
switched between -30 and 30 degrees in a 10-degree resolution. The dominantly positive φd indicates
that most of the lateral motion primitives used for the single-gate slalom were composed of right
banked turns, as shown in Figure 6.7. The crooked φd at a high frequency corresponds to the
moments when the vehicle was guided by the straight-line following in order to enter the gate
precisely and φd was given by a heading error. For longitudinal maneuver control, rather than
maintaining a constant altitude, a new desired altitude zd was set as the current altitude at
the start of each motion primitive. The desired pitch, θd = −Kp,z(zd − z), was then dedicated
to regulating the altitude error. The peaks in the desired pitch angle θd occurred when altitude
recovery was explicitly required.

6.4.2 Multiple-gate air slalom

The same navigation algorithms used in the single-gate air slalom were also applied to the
multiple-gate air slalom task. Two gates were placed in the workspace and their orientations
were unchanged and identifiable by a blue ground target. Since both gates were identical in
shape, the first found gate was labeled x1g and needed to be passed through first. The search and
identification process for both gates has already been shown in Figure 6.6.
In Figure 6.11, an entire navigation trajectory to enter the two gates sequentially was com-
pleted twice. The images at the top show the gate observations at the corresponding moments
indicated in the plot by the number. The minimum turning radius was about 20m when a lateral
motion primitive maintained the 30-degree roll angle. The relative posture between the gates was
estimated to be (27.5m, −20.1m) and 45.2 degrees. This slalom setting led our optimal motion
planning to eventually produce a figure-eight route shape.
The MDP motion planner achieved a successful autonomous guidance to enter the gates mul-
tiple times. Figure 6.12 shows the progress of the motion plan during the first slalom completion.
Entering the first gate was repeated three times because position and heading deviations from the
planned path were larger than the motion uncertainty modeled in the MDP due to an unexpected
wind gust. The entrance to the second gate was smoothly accomplished.
Figure 6.13 shows the responses of the autopilot feedback controllers during the multiple-gate
air slalom. Since the figure-eight route commanded a full swing of the roll angle, both right- and
left-turn lateral motion primitives were used almost equally. The maximum overshoot in the
regulation of φ was about 10 degrees when φd changed quickly. The altitude gradually decreased
since the execution of a lateral motion lost up to 5m of altitude. When the altitude dropped
below 15m, a longitudinal motion was executed to recover the altitude.


Figure 6.11: Navigation result on the multiple-gate air slalom task: two gates are placed in the workspace
and their orientations are unchanged and fixed by the blue square target. The whole route to enter both
gates in a row is completed twice. The images at the top show the gate observations at the corresponding
moments indicated in the plot.


Figure 6.12: MDP motion planning results in the multiple-gate air slalom task. At the end of execution
of each motion primitive, a new optimal plan to the gate is found from the precomputed MDP look-up
table and sent to the autopilot. The first two trials to approach the first gate fail due to insufficient target
observation time, but the entrance is successful at the third approach. The gray area indicates the camera’s
field-of-view. The color of each motion primitive in the planned path represents its final roll angle (30
degrees in red, 20 in green, 10 in blue, and 0 in black).


Figure 6.13: Autopilot control outputs to generate lateral and longitudinal motion primitives during the
multiple-gate air slalom task in Figure 6.11. The MDP planner provides the desired roll angle φd which is
switched between -30 to 30 degrees in a 10-degree resolution. The crooked φd in a high frequency indicates
the line following mode near the gate entrance. Altitude recovery by the peaks in the desired pitch angle
occurs only when the entrance is completed and the altitude is less than 15m.
Chapter 7

Conclusion and Future Work

This thesis presented vision-based algorithms and motion planning methods that are useful and
efficient building blocks for the autonomous navigation system for a small unmanned fixed-wing
airplane operating in an urban environment. The problems we address are how to provide a
driftless attitude estimation for flight control, how to conveniently calibrate visual and inertial
sensors in a working field, and how to generate a feasible motion plan that considers limited
agility in vehicle motion. The solutions to these problems are successfully integrated and demon-
strated in the air slalom task, which contains fundamental utilities required to accomplish typical
autonomous navigation tasks presented in an urban area. In this chapter, we summarize these
approaches and review the contributions, and propose some ideas for future work.

7.1 Summary

Visual-inertial attitude estimation Our approach to the gyroscope drift problem in the
attitude estimation is to use gravity-related line segments available in an urban scene. The regu-
larity existing in vertical and horizontal edges of man-made structures, which is associated with
vanishing point detection, can be effectively used to bound the attitude drift. In the Extended
Kalman filter which combines inertial and visual measurements, we presented new update equa-
tions for vertical and horizontal line segments, respectively. To achieve a fast and efficient line
classification in the presence of outliers, the hierarchical clustering uses a current best attitude
estimate as the prior in a RANSAC framework. The flight experiment around a football field at
15m high showed that the maximum attitude error during dynamic maneuvers is bounded to 5
degrees, which is two times smaller than that of the inertial-only method.

Sensor calibration for visual and inertial fusion Traditional calibration methods may
be meticulous but they demand expertise, labor, time and specialized equipment. Instead we


presented a new self-calibration method that requires no prior knowledge of a sensor setting and
easily performs in a field using natural reference features. We proved that the factorization method
can provide a linear solution for intrinsic parameters of an IMU using known magnitudes of exerted
forces, which are conveniently obtainable from gravity at a static condition for accelerometers,
and tracks of distant scene points for gyroscopes, respectively. Once the IMU is calibrated,
vertical edges of buildings are used for the calibration of a camera and sensor-to-sensor relative
orientation. The simulation and experiments showed that the accuracy of our linear method is
fairly comparable to that of other object-based methods or nonlinear solutions.

Trim state-based motion planning The challenge in motion planning for a fixed-wing is to
provide a feasible motion plan that can be executable by flight controllers. In addition, not only
must task-oriented constraints like a minimum target observation time be taken into account but
motion uncertainty due to wind or control error must also be considered for planning.
We addressed these problems through a discrete planning based on the finite trim states of a
fixed wing. A set of discrete primitive motions was constructed as transitions between the trim
states; the interconnections of these motion primitives were explored to find a feasible motion
plan for reaching a goal. Rather than seeking the shortest path, we used a probabilistic model
of motion uncertainty and maximized the probability of success in a goal region in the MDP. A
minimum target observation time is guaranteed by setting the visibility region as an intermediate
goal. The air-slalom experiments demonstrated that the MDP-based motion planner successfully
guides the vehicle to enter a gate.

Air-slalom task experiments We designed the air-slalom task where the UAV’s mission is
to search and locate several gates on the ground, and pass through them precisely. The purpose
of this task is to evaluate the capability of our autonomous navigation system in performing
maneuvers commonly required in urban operations. The successful completion of air-slalom
courses composed of single or multiple gates demonstrated the effectiveness of our navigation
system including the motion planning based on finite trim states, and the vehicle and target
states estimated by a low-grade GPS, IMU, barometer and camera.

7.2 Contributions
The following is a list of the contributions presented in this thesis in the areas of attitude esti-
mation, sensor calibration, and motion planning for unmanned fixed-wing airplanes operating in
an urban environment:
• A sensor fusion algorithm that combines inertial gyroscope and visual line segment mea-
surements and produces driftless attitude estimation.

• A factorization-based IMU calibration method that provides a linear solution for a redun-
dant IMU and an iterative solution for a triad IMU.
• A thorough analysis of the linear reconstruction space of the factorization and an identifi-
cation of the rank condition and degenerate load configurations.
• An efficient discrete motion planning method based on trim states that reduces the dimen-
sion of a configuration space in planning.
• An extension of Dubins metric to 3D for search-based planning.

• A development of a low-cost deployable and expendable fixed-wing UAV system and an


integration of the proposed algorithms into the navigation system.
• A demonstration of the air-slalom task for a single and multiple gates.

7.3 Future Work

The urban operation of a small fixed-wing airplane is an emerging research field that addresses
challenging problems both in innovative system integration and the inevitable trade-off between
payload feasibility and performance. A fully autonomous and intelligent UAV still requires more
advanced perception, control and planning; this thesis contributed to this endeavor in part with
the presented methods and results. The following paragraphs summarize future research direc-
tions on the topics dealt with in this thesis and possible enhancements of the proposed algorithms.

Line feature tracking In Chapter 3, line measurements used for the attitude estimation are
independently extracted in each image. Hence, the proposed line classification method always
requires a fresh start for a new set of lines. If line correspondences are available through tracking,
and their associations with vanishing points at the previous frame are known, this would signifi-
cantly reduce the time needed for the classification process. It is also possible to add the heading
direction ψ into the Kalman estimator in a situation where one prevalent horizontal vanishing
point can be tracked over frames.

More dynamic maneuvers The motion primitives used in the experiments have a maximum
of 30 degrees in the roll angle and thus produce mild maneuvers. Since they are only a subset of
all possible maneuvers, more dynamic maneuvers can be considered by increasing the maximum
roll angle or adding aerobatic motions like looping. These will provide tighter turns and more
flexible motions in 3D. The challenge, however, is that dynamic maneuvers also require high-grade
sensors for more accurate state estimation, and an advanced control technique to stabilize the
flight.

Obstacle detection and avoidance In the air-slalom experiment, one essential but missing
component that was not tested or fully demonstrated for the autonomous navigation is obstacle
detection and avoidance. Simple floating obstacles like a balloon or virtual no-go regions can be
placed for an obstacle avoidance test. To improve the quality of an obstacle-free path generated by
the greedy-first search in Section 5.3, it would be possible to integrate the search-based planning
with the MDP-based optimal path in order to produce a smoother, near-optimal plan
that dodges obstacles.

Experiment in a more realistic urban area For safety reasons, the experiment site where we
demonstrated the air-slalom task was a demolished urban area. Based on the proven navigation
capability to run a slalom course and enter a gate, the next challenging missions for our UAV
system could involve more realistic urban task scenarios, such as traveling through a passageway
between buildings or approaching a target on the road in a cluttered area.
Appendix A

Derivation of Jacobian for Line Angle

A.1 Derivation of Jacobian in EKF Prediction

Let ∆ωi = (ωi − bi)∆T for i = x, y or z. The process model (3.10) for χ = [x; b] is rewritten in
discrete time as χk+1 = f(χk), i.e.,

$$\begin{bmatrix} \phi_{k+1} \\ \theta_{k+1} \\ b_{x,k+1} \\ b_{y,k+1} \\ b_{z,k+1} \end{bmatrix} = \begin{bmatrix} \phi_k + (\omega_x - b_x)\Delta T + s_\phi t_\theta (\omega_y - b_y)\Delta T + c_\phi t_\theta (\omega_z - b_z)\Delta T \\ \theta_k + c_\phi (\omega_y - b_y)\Delta T - s_\phi (\omega_z - b_z)\Delta T \\ b_{x,k} \\ b_{y,k} \\ b_{z,k} \end{bmatrix} \qquad (A.1)$$

Then, the Jacobian $F = \partial f / \partial \chi$ in the EKF state transition is given as

$$F = \begin{bmatrix} 1 + c_\phi t_\theta \Delta\omega_y - s_\phi t_\theta \Delta\omega_z & (s_\phi/c_\theta^2)\, \Delta\omega_y + (c_\phi/c_\theta^2)\, \Delta\omega_z & -\Delta T & -s_\phi t_\theta \Delta T & -c_\phi t_\theta \Delta T \\ -s_\phi \Delta\omega_y - c_\phi \Delta\omega_z & 1 & 0 & -c_\phi \Delta T & s_\phi \Delta T \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \qquad (A.2)$$

A.2 Derivation of Jacobian in EKF Update

Let $s_x = v_x^* - m_x$, $s_y = v_y^* - m_y$, and $s = s_y/s_x$. Then, the line angle β in (3.13) is rewritten as

$$\beta = \tan^{-1}\!\left( \frac{v_y^* - m_y}{v_x^* - m_x} \right) = \tan^{-1}\!\left( \frac{s_y}{s_x} \right) = \tan^{-1}(s) \qquad (A.3)$$

and its partial derivative using the chain rule is

$$\frac{\partial \beta}{\partial(\cdot)} = \frac{1}{1+s^2}\, \frac{\partial s}{\partial(\cdot)} = \frac{s_x^2}{s_x^2 + s_y^2} \cdot \frac{s_y' s_x - s_y s_x'}{s_x^2} = \frac{v_y' s_x - v_x' s_y}{d^2} \qquad (A.4)$$

As denoted in Figure 3.6, let $d = \sqrt{s_x^2 + s_y^2}$, $r = \sqrt{v_x^{*2} + v_y^{*2}}$, $n_x = c_\phi m_x - s_\phi m_y$, and $n_y = s_\phi m_x + c_\phi m_y$. From (3.5) and (3.9), we can also rewrite r as $r = 1/t_\theta$ for $v_v^*$ and $r^2 = t_\psi^2/c_\theta^2 + t_\theta^2$ for $v_h^*$, respectively.
For $v_v^* = [\, s_\phi t_\theta^{-1},\ c_\phi t_\theta^{-1} \,]^\top$ in (3.5), the Jacobian for the line angle associated with a vertical
VP is derived as follows.

$$\frac{\partial v_v^*}{\partial \phi} = \begin{bmatrix} v_x' \\ v_y' \end{bmatrix} = \begin{bmatrix} c_\phi t_\theta^{-1} \\ -s_\phi t_\theta^{-1} \end{bmatrix} \qquad (A.5)$$

$$\begin{aligned} v_y' s_x - v_x' s_y &= -(s_\phi t_\theta^{-1})(v_x - m_x) - (c_\phi t_\theta^{-1})(v_y - m_y) \\ &= -t_\theta^{-1}(s_\phi v_x + c_\phi v_y) + t_\theta^{-1}(s_\phi m_x + c_\phi m_y) \\ &= -t_\theta^{-1}(s_\phi s_\phi t_\theta^{-1} + c_\phi c_\phi t_\theta^{-1}) + t_\theta^{-1}(s_\phi m_x + c_\phi m_y) \\ &= -t_\theta^{-2} + t_\theta^{-1} n_y = -r^2 + r n_y \end{aligned} \qquad (A.6)$$

$$\frac{\partial v_v^*}{\partial \theta} = \begin{bmatrix} v_x' \\ v_y' \end{bmatrix} = \begin{bmatrix} -s_\phi s_\theta^{-2} \\ -c_\phi s_\theta^{-2} \end{bmatrix} \qquad (A.7)$$

$$\begin{aligned} v_y' s_x - v_x' s_y &= (-c_\phi s_\theta^{-2})(v_x - m_x) - (-s_\phi s_\theta^{-2})(v_y - m_y) \\ &= -s_\theta^{-2}(c_\phi v_x - s_\phi v_y) + s_\theta^{-2}(c_\phi m_x - s_\phi m_y) \\ &= -s_\theta^{-2}(c_\phi s_\phi t_\theta^{-1} - s_\phi c_\phi t_\theta^{-1}) + s_\theta^{-2}(c_\phi m_x - s_\phi m_y) = s_\theta^{-2} n_x \end{aligned} \qquad (A.8)$$

For $v_h^* = [\, c_\phi t_\psi/c_\theta - s_\phi t_\theta,\ -s_\phi t_\psi/c_\theta - c_\phi t_\theta \,]^\top$ in (3.9), the Jacobian for the line angle associated with a horizontal VP is derived as follows.

$$\frac{\partial v_h^*}{\partial \phi} = \begin{bmatrix} v_x' \\ v_y' \end{bmatrix} = \begin{bmatrix} -s_\phi t_\psi/c_\theta - c_\phi t_\theta \\ -c_\phi t_\psi/c_\theta + s_\phi t_\theta \end{bmatrix} \qquad (A.9)$$

$$\begin{aligned} v_y' s_x - v_x' s_y &= (-c_\phi t_\psi/c_\theta + s_\phi t_\theta)(v_x - m_x) + (s_\phi t_\psi/c_\theta + c_\phi t_\theta)(v_y - m_y) \\ &= (-c_\phi t_\psi/c_\theta + s_\phi t_\theta)(c_\phi t_\psi/c_\theta - s_\phi t_\theta) + (s_\phi t_\psi/c_\theta + c_\phi t_\theta)(-s_\phi t_\psi/c_\theta - c_\phi t_\theta) \\ &\quad - (-c_\phi t_\psi/c_\theta + s_\phi t_\theta)\, m_x - (s_\phi t_\psi/c_\theta + c_\phi t_\theta)\, m_y \\ &= -(c_\phi t_\psi/c_\theta - s_\phi t_\theta)^2 - (s_\phi t_\psi/c_\theta + c_\phi t_\theta)^2 + (t_\psi/c_\theta)(c_\phi m_x - s_\phi m_y) - t_\theta(s_\phi m_x + c_\phi m_y) \\ &= -t_\psi^2/c_\theta^2 - t_\theta^2 + (t_\psi/c_\theta)\, n_x - t_\theta\, n_y = -r^2 + (t_\psi/c_\theta)\, n_x - t_\theta\, n_y \end{aligned} \qquad (A.10)$$

$$\frac{\partial v_h^*}{\partial \theta} = \begin{bmatrix} v_x' \\ v_y' \end{bmatrix} = \begin{bmatrix} c_\phi t_\psi t_\theta/c_\theta - s_\phi/c_\theta^2 \\ -s_\phi t_\psi t_\theta/c_\theta - c_\phi/c_\theta^2 \end{bmatrix} \qquad (A.11)$$

$$\begin{aligned} v_y' s_x - v_x' s_y &= (-s_\phi t_\psi t_\theta/c_\theta - c_\phi/c_\theta^2)(v_x - m_x) - (c_\phi t_\psi t_\theta/c_\theta - s_\phi/c_\theta^2)(v_y - m_y) \\ &= -(s_\phi t_\psi t_\theta/c_\theta + c_\phi/c_\theta^2)(c_\phi t_\psi/c_\theta - s_\phi t_\theta) + (c_\phi t_\psi t_\theta/c_\theta - s_\phi/c_\theta^2)(s_\phi t_\psi/c_\theta + c_\phi t_\theta) \\ &\quad + (1/c_\theta^2)\, n_x + (t_\psi t_\theta/c_\theta)\, n_y \\ &= t_\psi t_\theta^2/c_\theta - t_\psi/c_\theta^3 + (1/c_\theta^2)\, n_x + (t_\psi t_\theta/c_\theta)\, n_y \\ &= (t_\psi/c_\theta^3)(s_\theta^2 - 1) + (1/c_\theta^2)\, n_x + (t_\psi t_\theta/c_\theta)\, n_y \\ &= -(t_\psi/c_\theta) + (1/c_\theta^2)\, n_x + (t_\psi t_\theta/c_\theta)\, n_y \\ &= (1/c_\theta^2)\big( n_x + t_\psi(-c_\theta + s_\theta n_y) \big) \end{aligned} \qquad (A.12)$$
Appendix B

Discussion on Factorization-based
Calibration

In this chapter, we discuss the linear solution space of L constructed by a load constraint set C
for the redundant calibration described in Section 4.3. Let |C| denote the number of independent
constraints in C.

B.1 Practical Issues

Since the measurement Z is inevitably corrupted by noise, additional steps are required to meet
the assumptions made for the factorization. Some considerations and caveats are:
• In the SVD of Z in (4.4), a noisy Z ∈ R^{m×n} may have a rank greater than p. The rank-p
enforcement on Z can be made by

$$\hat{Z} = (U_p \Sigma_p^{1/2})(\Sigma_p^{1/2} V_p^\top) = \hat{F}_b \hat{S}_b \qquad (B.1)$$

where Σp is the top-left p × p block of Σ, and Up and Vp are the first p columns of U and
V, respectively. Ẑ is the closest rank-p matrix to Z in the sense of the Frobenius norm (see
the sketch after this list).

• When rank(L) = 10, a least-squares solution qp (4.16) may result in a full-rank Q that
violates the rank condition of the q-constraint (4.15a). To enforce a rank-3 Q, the smallest
singular value of Q is set to zero by the SVD of Q.
• When rank(L) = 9, multiple solutions for q may exist (case 2 in Fig. 4.4), but the corresponding
Sb are discriminative enough to identify inappropriate ones that have severely distorted and
warped shapes. Since the bias is generally uniform, the true solution can be selected as the
one with the smallest variance of b in Sb.


• If measurement noise in Z is too high, the solution for q does not always produce a positive-
definite Q3×3 . Thus, the Cholesky decomposition fails to obtain A3×3 . This sensitivity to
noise can be reduced if further loads are made, so that more constraints are used.
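A sketch of the rank-p enforcement in (B.1) using Eigen is given below (p = 4 for the accelerometer example); the split of the singular-value square roots between the two factors follows (B.1), and the variable names are illustrative.

```cpp
#include <Eigen/Dense>

// Rank-p truncation of Z and factorization into F_b_hat and S_b_hat, as in (B.1).
void rankPFactorization(const Eigen::MatrixXd& Z, int p,
                        Eigen::MatrixXd& Fb, Eigen::MatrixXd& Sb)
{
    Eigen::JacobiSVD<Eigen::MatrixXd> svd(Z, Eigen::ComputeThinU | Eigen::ComputeThinV);
    const Eigen::VectorXd sqrtSig = svd.singularValues().head(p).cwiseSqrt();
    Fb = svd.matrixU().leftCols(p) * sqrtSig.asDiagonal();               // U_p * Sigma_p^{1/2}
    Sb = sqrtSig.asDiagonal() * svd.matrixV().leftCols(p).transpose();   // Sigma_p^{1/2} * V_p^T
}
```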

B.2 Numerical Example

This is an example in which measurements are collected into Z with no measurement noise from
a four-component accelerometer unit and ten different attitudes in a static condition (n = 4, m = 10).
From the SVD of Z with rank(Z) = 4, $\hat{F}_b$ and $\hat{S}_b$ are computed as $Z = \hat{F}_b \hat{S}_b$:

$$\begin{bmatrix}
0.134 & 1.433 & 0.750 & 1.393 \\
0.134 & 0.567 & 0.750 & 1.826 \\
0.412 & 1.000 & 0.191 & 1.866 \\
0.792 & 1.847 & 0.511 & 1.026 \\
0.792 & 0.153 & 0.511 & 1.873 \\
1.208 & 1.847 & 0.511 & 0.818 \\
1.208 & 1.000 & 0.022 & 1.588 \\
1.588 & 1.701 & 0.595 & 0.642 \\
1.588 & 0.299 & 0.595 & 1.342 \\
1.866 & 1.000 & 0.500 & 0.921
\end{bmatrix} =
\begin{bmatrix}
-0.75 & 0.22 & 0.64 & -0.26 \\
-0.67 & 0.71 & 0.21 & -0.32 \\
-0.78 & 0.49 & 0.24 & 0.40 \\
-0.86 & -0.30 & 0.48 & 0.03 \\
-0.70 & 0.66 & -0.38 & -0.08 \\
-0.88 & -0.53 & 0.26 & 0.01 \\
-0.85 & 0.11 & -0.20 & 0.57 \\
-0.89 & -0.68 & -0.02 & -0.13 \\
-0.76 & 0.11 & -0.73 & -0.23 \\
-0.85 & -0.41 & -0.53 & -0.08
\end{bmatrix}
\begin{bmatrix}
-1.25 & -1.40 & -0.60 & -1.61 \\
-0.68 & -0.72 & 0.08 & 1.13 \\
-1.00 & 0.90 & 0.09 & -0.04 \\
0.04 & 0.13 & -0.84 & 0.17
\end{bmatrix}$$

From $C^*_{\tau=1}$, L has a one-dimensional null space and the corresponding solutions for q and α are

$$q_p = [\,1.28\ \ {-0.15}\ \ {-0.01}\ \ 0.00\ \ 0.39\ \ 0.01\ \ {-0.09}\ \ 0.42\ \ 0.00\ \ 0.15\,]^\top$$
$$q_n = [\,0.40\ \ {-0.11}\ \ 0.03\ \ 0.23\ \ {-0.55}\ \ {-0.01}\ \ 0.10\ \ {-0.65}\ \ 0.00\ \ {-0.21}\,]^\top$$
$$\alpha = 0.65,\ 0.65,\ 0.65,\ -2.14$$

Since only α = −2.14 satisfies the q-constraint (4.15), the true Sb and Fb are recovered as

$$A_{4\times4} = \begin{bmatrix} 0.65 & 0 & 0 & -1.24 \\ 0.13 & 1.24 & 0 & 0.18 \\ -0.11 & 0.03 & 1.34 & -0.01 \\ -0.76 & -0.17 & -0.05 & -0.12 \end{bmatrix}, \quad
S_b = \begin{bmatrix} 1 & 0 & 0 & -0.50 \\ 0 & 1 & 0 & -0.50 \\ 0 & 0 & 1 & -0.71 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \quad
F_b = \begin{bmatrix} -0.87 & 0.43 & -0.25 & 1 \\ -0.87 & -0.43 & -0.25 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ 0.59 & -0.70 & -0.40 & 1 \\ 0.87 & 0.00 & -0.50 & 1 \end{bmatrix}$$

B.3 Minimum number of load constraints

The factorization-based calibration is equivalent to finding q subject to the q-constraint (4.15).
The 9 degrees of freedom in q can be verified again from the fact that, for a given Q, A3×3
in A4×3 is determined only up to a rotation matrix R (12 − 3 = 9). Hence, nine load constraints (k = 9) are
sufficient to find q,

$$|C|_{min} = 9 \qquad (B.2)$$

which corresponds to the case rank(L) = 9 in (4.17).


The required number of calibrating loads (m) varies depending on how individual loads are
associated with each other in C. For example, when |C| = |C|min , the maximum increases up
to m = 18 for C = {c12 , c34 , c56 , . . .}, and the minimum is m = 4 when C = {c11 , c22 , c33 ,
c44 , c12 , c13 , c14 , c23 , c24 } from which a linearly dependent c34 is excluded. The minimal load
condition (m = 4) coincides with the case where a matrix Fb ∈ R4×4 is completely known and
the calibration is directly solvable by Sb = F−1
b Z.

B.4 Gravity magnitude constraint


From Definition 1, a load constraint set solely composed of the gravity magnitude is C*_{τ=1} (a known
constant magnitude is scaled to 1 without loss of generality). Firstly, the following lemmas are
derived to investigate the rank of L.

Definition 2. rank(L | F̂b, C) is the rank of L when F̂b and C are given for the construction of L in (4.13).

Lemma 1. rank(L | F̂b, C) = rank(L | Fb, C) for any C.

Lemma 2. rank(L | F̂b, C) = min{10, rank(S[fb,i ⊗ fb,j])}

Proof. (for Lemma 1 and 2) From (4.7), f̂b,i = A⊤4×4 fb,i. The mixed-product and transpose
properties of the Kronecker product [51] rewrite L in (4.12) as follows.

$$L(\hat{F}_b, C) = S[\hat{f}_{b,i} \otimes \hat{f}_{b,j}]^\top T_{16\times10} \qquad (B.3)$$
$$\phantom{L(\hat{F}_b, C)} = S[f_{b,i} \otimes f_{b,j}]^\top (A_{4\times4} \otimes A_{4\times4})^\top T_{16\times10} \qquad (B.4)$$
$$L(F_b, C) = S[f_{b,i} \otimes f_{b,j}]^\top T_{16\times10} \qquad (B.5)$$

From rank(XY) = min{rank(X), rank(Y)}, L has the same rank for Fb and F̂b since rank(A4×4 ⊗
A4×4) = 16 and rank(T16×10) = 10. Lemma 2 is proven by (B.5).
Lemma 1 and 2 enable us to analyze the rank of the linear system L(F̂b, C) in (4.13) using the
true Fb instead of F̂b. When C = C*_{τ=1}, L has the following properties.

Proposition 1. When C = C*_{τ=1}, rank(L) ≤ 9.



Proof. Let fi = [xi, yi, zi]⊤. The column expansion of the matrix S[f⊤b,i ⊗ f⊤b,i] with f⊤b,i = [fi⊤ 1]
shows that it has six duplicated columns in (B.6) and thus its maximum rank is bounded to 10
in (B.7).

$$\mathrm{rank}(S[f_{b,i} \otimes f_{b,i}]) = \mathrm{rank}(S[(f_i \otimes f_i)^\top \ f_i^\top \ f_i^\top \ 1]) \qquad (B.6)$$
$$= \mathrm{rank}(S[x_i^2\ \ y_i^2\ \ z_i^2\ \ x_i y_i\ \ y_i z_i\ \ z_i x_i\ \ x_i\ \ y_i\ \ z_i\ \ 1]) \qquad (B.7)$$
$$= \mathrm{rank}(S[x_i^2\ \ y_i^2\ \ z_i^2\ \ x_i y_i\ \ y_i z_i\ \ z_i x_i\ \ x_i\ \ y_i\ \ z_i]) \qquad (B.8)$$
$$\leq 9 \qquad (B.9)$$

Since all the loads in C*_{τ=1} are subject to xi² + yi² + zi² − 1 = 0, the first three columns and the
last column in (B.7) are linearly dependent and thus the maximum rank decreases by one. From
Lemma 2, rank(L) ≤ 9 for C*_{τ=1}.

Proposition 2. When C = Cτ∗=1 , only one root of f (α) satisfies the q-constraint.

Proof. Q = A4×3 A⊤4×3 = A4×4 diag(1, 1, 1, 0) A⊤4×4 from (4.11). Suppose A4×4 = I. From
f̂b,i = fb,i = [xi yi zi 1]⊤ and C*_{τ=1} (xi² + yi² + zi² = 1 for all i), a linear system Lq = 1 is satisfied
by the following particular and null solutions:

$$L = S[\,x_i^2\ \ 2x_i y_i\ \ 2z_i x_i\ \ 2x_i\ \ y_i^2\ \ 2y_i z_i\ \ 2y_i\ \ z_i^2\ \ 2z_i\ \ 1\,] \qquad (B.10)$$
$$q_p = \tfrac{1}{4}[\,1\ 0\ 0\ 0\ 1\ 0\ 0\ 1\ 0\ 3\,]^\top \qquad (B.11)$$
$$q_n = \tfrac{1}{2}[\,1\ 0\ 0\ 0\ 1\ 0\ 0\ 1\ 0\ {-1}\,]^\top \qquad (B.12)$$

Then, q = qp + α qn is equal to a symmetric matrix Q s.t.

$$Q = \begin{bmatrix} q_1 & & & \\ q_2 & q_5 & & \\ q_3 & q_6 & q_8 & \\ q_4 & q_7 & q_9 & q_{10} \end{bmatrix} = \frac{1}{4}\begin{bmatrix} 1+2\alpha & & & \\ 0 & 1+2\alpha & & \\ 0 & 0 & 1+2\alpha & \\ 0 & 0 & 0 & 3-2\alpha \end{bmatrix} \qquad (B.13)$$

From det(Q) = 0 in (4.15a), this first q-constraint produces f(α) = (1 + 2α)³(3 − 2α) = 0, which
has two real roots, α = 1.5 or −0.5. Only α = 1.5 is true because α = −0.5 cannot produce a
positive definite Q3×3. This uniqueness of α holds for the multiplication of Q by any invertible
matrix A4×4.

Regardless of how many measurements (m ≥ |C|min) are collected in Z, Propositions 1 and
2 claim that the accelerometer calibration based on the gravity magnitude should take a one-dimensional
null space of L into account and that a unique reconstruction exists for Sb.


Figure B.1: Degenerate condition for calibrating loads F in C*_{τ=1}: F is degenerate if F lies on the
intersection of Q1 (sphere) and Q2 (any other quadric, e.g., cone).


Figure B.2: Examples of degenerate and non-degenerate calibrating load sets F in C*_{τ=1}: (a) F ∈ DF
when F lies on Q2: π(φ1)π(φ2) generated by two discrete roll angles φ, (b) F ∈ DF when F lies on Q2: π(θ1)π(θ2)
generated by two discrete pitch angles θ, (c) F ∉ DF when F lies on a non-quadric π(φ1)π(φ2)π(φ3) generated
by three discrete roll angles.

B.5 Degenerate load configuration


Proposition 1 shows the case where a particular load constraint set produces a null space in L. We also investigate the case where a certain geometric distribution of loads (f_1, …, f_m) causes a null space N(L). If the dimension of the null space is greater than one, i.e., rank(L) < 9, the linear system L is unsolvable, which leads to a failure in the factorization.

Definition 3. D_F is the set of degenerate load distributions F = (f_1, …, f_m) for which rank(L|F_b, C) < 9.

Proposition 3. When C = C^*, N(L) ≠ ∅ if and only if a quadric Q exists for F.

Proof. Let L_S be the stacked matrix in (B.7). N(L) ≠ ∅ if a non-zero a exists such that L_S a = 0. For a given C^*, a can be represented by a quadric Q ∈ R^{4×4}: f_{b,i}^⊤ Q(a) f_{b,i} = 0, i.e., a_0 x_i^2 + a_1 y_i^2 + a_2 z_i^2 + a_3 x_i y_i + a_4 y_i z_i + a_5 z_i x_i + a_6 x_i + a_7 y_i + a_8 z_i + a_9 = 0 for all i. Hence, N(L) ≠ ∅ is equivalent to the existence of a quadric Q represented by all the loads in F.

Proposition 4. When C = C^*_{τ=1}, F ∈ D_F if and only if a non-sphere quadric Q_2 exists for F.

Proof. The degeneracy condition rank(L) < 9 is equivalent to the existence of at least two linearly independent vectors (a_1, a_2, …) such that L_S a_1 = L_S a_2 = ⋯ = 0. Proposition 1 shows that C^*_{τ=1} already provides a_1 = [1 1 1 0 0 0 0 0 0 −1]^⊤, which is the unit sphere Q_1: x_i^2 + y_i^2 + z_i^2 − 1 = 0. Then a_2 represents another quadric Q_2, which must not be a unit sphere.

From Proposition 3, geometric properties of D_F for an intra-relation constraint C^* can be analyzed via 3D quadric fitting. The dimension η of N(L) equals the number of quadrics (Q_1, …, Q_η) that can represent F simultaneously. For example, Proposition 4 for C^*_{τ=1} is the case η = 2, in which the loads in F ∈ D_F are distributed at the intersection of a unit sphere Q_1 and any other quadric Q_2, as in Fig. B.1. In essence, D_F in C^*_{τ=1} is determined by whether another quadric fitting of F exists besides the original constraint (geometrically a unit sphere), as stated in the following remarks:
Remark 1. When C = C^*_{τ=1}, F ∈ D_F if F lies on one or two cutting planes of a unit sphere.

Remark 2. When C = C^*_{τ=1}, F ∉ D_F if F lies on three or more cutting planes of a unit sphere.

The load set F distributed on two planes as in Fig. B.2(a-b) is represented by π = π_1 π_2 = (a_1 x_i + b_1 y_i + c_1 z_i + d_1)(a_2 x_i + b_2 y_i + c_2 z_i + d_2) = 0. It is degenerate since this polynomial π can be represented by a non-sphere quadric Q_2. Otherwise, if F is distributed on more than two planes as in Fig. B.2(c), it is non-degenerate because a cubic or higher-order polynomial (∏_{i=1}^{n} π_i = 0 with n ≥ 3) cannot be represented by a quadric.

When an IMU is randomly oriented by hand during the calibration, it is very unlikely that F forms a non-sphere quadric. However, attention should be paid when pivotal devices such as a tripod are used: one should make sure to generate at least three distinct roll or pitch angles during measurement collection, as the sketch below illustrates.
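The remarks can be illustrated numerically. In the sketch below (Python/NumPy; the gravity-direction generator, angle values, and names are hypothetical, not the calibration procedure of this thesis), each fixed roll angle with a swept pitch places the loads on one cutting plane of the unit sphere; with two roll angles the monomial matrix of (B.7) drops to rank 8 (degenerate), while a third roll angle restores rank 9 (non-degenerate).

import numpy as np

def monomial_rank(F):
    # Rank of the stacked monomial matrix of (B.7) for loads F = [(x, y, z), ...].
    rows = [[x*x, y*y, z*z, x*y, y*z, z*x, x, y, z, 1.0] for x, y, z in F]
    return np.linalg.matrix_rank(np.array(rows))

def loads_from_rolls(rolls, n=60):
    # Gravity direction in the body frame for roll phi and pitch theta
    # (standard ZYX Euler convention); a fixed phi with varying theta keeps
    # the loads on the cutting plane y*cos(phi) - z*sin(phi) = 0.
    F = []
    for phi in rolls:
        for theta in np.linspace(-1.2, 1.2, n):
            F.append([-np.sin(theta),
                      np.cos(theta) * np.sin(phi),
                      np.cos(theta) * np.cos(phi)])
    return F

print(monomial_rank(loads_from_rolls([0.0, 0.7])))        # 8: two cutting planes, degenerate
print(monomial_rank(loads_from_rolls([0.0, 0.7, 1.4])))   # 9: three cutting planes, non-degenerate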
Bibliography

[1] A. Puri, “A survey of unmanned aerial vehicles for traffic surveillance,” Department of
Computer Science and Engineering, University of South Florida, Tech. Rep., 2004.
[2] D. Floreano, J.-C. Zufferey, M. Srinivasan, and C. Ellington, Flying Insects and Robots.
Springer-Verlag, 2009.
[3] Red Bull Air Race, International Series of Air Races, 2011. [Online]. Available:
http://www.redbullairrace.com
[4] M. Euston, P. Coote, R. Mahony, J. Kim, and T. Hamel, “A complementary filter for attitude
estimation of a fixed-wing uav,” in Proc. IEEE/RSJ Int’l Conf. on Intelligent Robots and
Systems, Sept 2008, pp. 340–345.
[5] W. Premerlani and P. Bizard, Direction Cosine Matrix IMU: Theory, Mar. 2009. [Online].
Available: http://gentlenav.googlecode.com/files/DCMDraft2.pdf
[6] B. Taylor, C. Bil, and S. Watkins, “Horizon sensing attitude stabilisation: A VMC autopilot,”
in 18th International UAV Systems Conference, 2003.
[7] S. M. Ettinger, M. C. Nechyba, P. G. Ifju, and M. Waszak, “Vision-guided flight stability
and control for micro air vehicles,” in Proc. IEEE/RSJ Int’l Conf. on Intelligent Robots and
Systems, 2002, pp. 2134–2140.
[8] S. Todorovic, M. C. Nechyba, and P. G. Ifju, “Sky/ground modeling for autonomous MAV
flight,” in Proc. IEEE Int’l Conf. on Robotics and Automation, May 2003, pp. 1422–1427.
[9] T. Cornall, G. Egan, and A. Price, “Aircraft attitude estimation from horizon video,” Elec-
tronic Letters, vol. 42, no. 12, 2006.
[10] C. Demonceaux, P. Vasseur, and C. Pégard, “Omnidirectional vision on UAV for attitude
computation,” in Proc. IEEE Int’l Conf. on Robotics and Automation, 2006, pp. 2842–2847.
[11] D. Dusha, W. Boles, and R. Walker, “Attitude estimation for a fixed-wing aircraft using
horizon detection and optical flow,” in 9th Biennial Conference of the Australian Pattern
Recognition Society on Digital Image Computing Techniques and Applications, Dec 2007, pp. 485–492.
[12] P. Denis, J. H. Elder, and F. J. Estrada, “Efficient edge-based methods for estimating Manhattan frames in urban imagery,” in European Conf. on Computer Vision, 2008, pp. 197–210.
[13] J. A. Shufelt, “Performance evaluation and analysis of vanishing point detection techniques,”
IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 21, no. 3, pp. 282–288, 1999.
[14] M. E. Antone and S. Teller, “Automatic recovery of relative camera rotations for urban
scenes,” in IEEE Conf. on Computer Vision and Pattern Recognition, Sept 2000.
[15] C. Demonceaux, P. Vasseur, and C. Pégard, “UAV attitude computation by omnidirectional
vision in urban environment,” in Proc. IEEE Int’l Conf. on Robotics and Automation, 2007.
[16] J.-C. Bazin, I. Kweon, C. Demonceaux, and P. Vasseur, “UAV attitude estimation by vanish-
ing points in catadioptric images,” in Proc. IEEE Int’l Conf. on Robotics and Automation,
2008.
[17] D. Gebre-Egziabher, G. Elkaim, J. Powell, and B. Parkinson, “A gyro-free quaternion-based
attitude determination system suitable for implementation using low cost sensors,” in IEEE
Symposium on Position Location and Navigation, 2000.
[18] Y. Li, K. Zhang, C. Roberts, and M. Murata, “On-the-fly GPS-based attitude determination
using single- and double-differenced carrier phase measurements,” GPS Solutions, vol. 8,
no. 2, pp. 93–102, 2004.
[19] N. Metni, J.-M. Pflimlin, T. Hamel, and P. Soueres, “Attitude and gyro bias estimation for
a flying UAV,” in Proc. IEEE/RSJ Int’l Conf. on Intelligent Robots and Systems, 2005, pp.
1114–1120.
[20] A. M. Eldredge, “Improved state estimation for miniature air vehicles,” Master’s thesis,
Brigham Young University, Provo, Utah, Tech. Rep., Dec.
[21] M. Hwangbo and T. Kanade, “Visual-inertial UAV attitude estimation using urban scene
regularities,” in Proc. IEEE Int’l Conf. on Robotics and Automation, Sept 2011, pp. 340–345.
[22] S. T. Barnard, “Interpreting perspective images,” Artificial Intelligence, vol. 21, pp. 435–462,
1984.
[23] G. Welch and G. Bishop, “An introduction to the Kalman filter,” TR 95-041, Department of
Computer Science, University of North Carolina at Chapel Hill, Tech. Rep., 1995.
[24] M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting
with applications to image analysis and automated cartography,” Comm. of the ACM, vol. 24,
pp. 381–395, 1981.
[25] C. Rother, “A new approach for vanishing point detection in architectural environments,”
Journal Image and Vision Computing, Special Issue on BMVC 2000, vol. 20, no. 9–10, pp.
647–656, 2002.
[26] D. G. Aguilera, J. G. Lahoz, and J. F. Codes, “A new method for vanishing points detection
in 3D reconstruction from a single view,” in Proceedings of the ISPRS Commission V, 2005,
pp. 197–210.
[27] H. Wildenauer and M. Vincze, “Vanishing point detection in complex man-made worlds,”
in IEEE Int’l Conf. on Image Analysis and Processing, 2007, pp. 615–622.
[28] J. F. Canny, “A computational approach to edge detection,” IEEE Trans. on Pattern Anal-
ysis and Machine Intelligence, vol. 8, no. 6, pp. 679–698, 1986.
[29] R. M. Rogers, Applied Mathematics in Integrated Navigation Systems (AIAA Education
Series), 2nd ed. AIAA, 2003.
[30] J. J. Hall and R. L. Williams II, “Inertial measurement unit calibration platform,” Journal of
Robotic Systems, vol. 17, no. 11, pp. 623–632, 2000.
[31] J. W. Diesel, “Calibration of a ring laser gyro inertial navigation system,” in 13th Biennial
Guidance Test Symposium, Oct. 1987.
[32] E. Nebot and H. Durrant-Whyte, “Initial calibration and alignment of low-cost inertial
navigation,” Journal of Robotic Systems, vol. 16, no. 2, pp. 81–92, 1999.
[33] A. Kim and M. F. Golnaraghi, “Initial calibration of an inertial measurement unit using
an optical position tracking system,” in IEEE Position Location and Navigation Symposium
(PLANS 2004), Apr. 2004.
[34] Z. Dong, G. Zhang, Y. Luo, C. C. Tsang, G. Shi, S. Y. Kwok, W. Li, P. Leong, and M. Y.
Wong, “A calibration method for MEMS inertial sensors based on optical tracking,” in 2nd
IEEE Int’l Conf. on Nano/Micro Engineered and Molecular Systems, 2007.
[35] P. Lang and A. Pinz, “Calibration of hybrid vision / inertial tracking system,” in Proc. 2nd
Workshop on Integration of Vision and Inertial Sensors (INERVIS’05), 2005.
[36] J. Lobo and J. Dias, “Relative pose calibration between visual and inertial sensors,” Inter-
national Journal of Robotics Research, vol. 23, no. 12, pp. 1157–1195, 2004.
[37] F. M. Mirzaei and S. I. Roumeliotis, “A Kalman filter-based algorithm for IMU-camera
calibration: Observability analysis and performance evaluation,” IEEE Trans. on Robotics,
vol. 24, no. 5, pp. 1143–1156, 2008.
[38] J. Kelly and G. S. Sukhatme, “Fast relative pose calibration for visual and inertial sensors,”
in 11th Int’l Symp. Experimental Robotics (ISER’08), 2008.
[39] J. D. Hol, T. B. Schön, and F. Gustafsson, “Modeling and calibration of inertial and vision
sensors,” International Journal of Robotics Research, vol. 29, no. 2–3, pp. 231–244, 2010.
[40] S. Y. Cho and C. G. Park, “A calibration technique for a redundant IMU containing low-
grade inertial sensors,” ETRI Journal, vol. 27, no. 4, pp. 418–426, 2005.
[41] A. Lai, D. A. James, J. P. Hayes, and E. C. Harvey, “Semi-automatic calibration technique
using six inertial frames of reference,” in Proceedings of the SPIE, vol. 5274, 2004, pp. 531–
542.
[42] I. Skog and P. Handel, “Calibration of a MEMS inertial measurement unit,” in XVII IMEKO
World Congress, Brazil, Sep. 2006.
[43] Z. F. Syed, P. Aggarwal, C. Goodall, X. Niu, and N. El-Sheimy, “A new multi-position
calibration method for MEMS inertial navigation systems,” Measurement Science and Tech-
nology, vol. 18, no. 7, pp. 1897–1907, 2007.
[44] S. P. Won and F. Golnaraghi, “A triaxial accelerometer calibration method using a mathe-
matical model,” IEEE Transaction on Instrumentation and Measurement, vol. 59, no. 8, pp.
2144–2153, 2010.
[45] C. Tomasi and T. Kanade, “Shape and motion from image streams under orthography: a
factorization method,” International Journal of Computer Vision, vol. 9, no. 2, pp. 137–154,
1992.
[46] R. M. Voyles, J. D. Morrow, and P. K. Khosla, “Shape from motion approach to rapid and
precise force/torque sensor calibration,” in Int’l Mechanical Eng Congress and Exposition,
Nov. 1995, pp. 2171–2175.
[47] ——, “Including sensor bias in shape from motion calibration and sensor fusion,” in IEEE
Multisensor Fusion and Integration Conference, Dec. 1996.
[48] T. Kanade and D. Morris, “Factorization methods for structure from motion,” Philosophical
Transactions: Mathematical, Physical and Engineering Sciences, vol. 356, pp. 1153–1173,
1998.
[49] K. Kim, Y. Sun, R. M. Voyles, and B. J. Nelson, “Calibration of multi-axis MEMS force
sensors using the shape-from-motion method,” IEEE Sensors Journal, vol. 7, no. 3, pp.
344–350, 2007.
[50] M. Hwangbo and T. Kanade, “Factorization-based calibration method for MEMS inertial
measurement unit,” in Proc. IEEE Int’l Conf. on Robotics and Automation, San Jose, USA,
Apr. 2008.
[51] G. H. Golub and C. F. V. Loan, Matrix Computations, 3rd ed. The Johns Hopkins University
Press, 1996.
[52] Analog Devices, ADXL203 Precision Dual-Axis iMEMS Accelerometer, Mar. 2006. [Online].
Available: http://www.analog.com/en/sensors/inertial-sensors/adxl203/products/product.
html
[53] R. I. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed. Cam-
bridge University Press, 2004.
[54] R. Hartley, “Self-calibration from multiple views with a rotating camera,” in European Conf.
on Computer Vision, 1994, pp. 471–478.
[55] ——, “Self-calibration of stationary cameras,” International Journal of Computer Vision,
vol. 22, no. 1, pp. 5–23, 1997.
[56] K. S. Arun, T. S. Huang, and S. D. Bolstein, “Least-squares fitting of two 3-d point sets,”
IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 9, no. 5, pp. 698–700, 1987.
[57] M. Hwangbo, J.-S. Kim, and T. Kanade, “Inertial-aided KLT feature tracking for a moving
camera,” in Proc. IEEE/RSJ Int’l Conf. on Intelligent Robots and Systems, 2009, pp. 1909–
1916.
[58] W. Sepp and S. Fuchs, DLR CalLab and CalDe: The DLR Camera Calibration Toolbox,
2008. [Online]. Available: http://www.robotic.dlr.de/callab/
[59] D. Liebowitz and A. Zisserman, “Metric rectification for perspective images of planes,” in
IEEE Conf. on Computer Vision and Pattern Recognition, Jun. 1998, pp. 482–488.
[60] B. Triggs, P. McLauchlan, R. Hartley, and A. Fitzgibbon, “Bundle adjustment – a modern
synthesis,” in Vision Algorithms: Theory and Practice, ser. Lecture Notes in Computer
Science. Springer Berlin, 2000, vol. 1883, pp. 153–177.
[61] B. Sinopoli, M. Micheli, G. Donata, and T. Koo, “Vision based navigation for an unmanned
aerial vehicle,” in Proc. IEEE Int’l Conf. on Robotics and Automation, 2001.
[62] R. Zapata and P. Lepinay, “Flying among obstacles,” in European Workshop on Advanced
Mobile Robots, 1999.
[63] J. Sasiadek and I. Duleba, “3D local trajectory planner for UAV,” Journal of Intelligent and Robotic Systems, vol. 29, no. 2, pp. 191–210, Oct. 2000.
[64] C. Reynolds, “Steering behaviors for autonomous characters,” in Game Developers Confer-
ence, 1999.
[65] J. Go, T. D. Vu, and J. Kuffner, “Autonomous behaviors for interactive vehicle animations,”
Graphics Models, vol. 68, no. 2, pp. 90–112, 2006.
[66] J. F. Canny, The Complexity of Robot Motion Planning. MIT, Cambridge, 1988.
[67] J.-C. Latombe, Robot Motion Planning. Kluwer Academics Publication, 1991.
[68] H. Choset, K. M. Lynch, S. Hutchinson, G. Kantor, W. Burgard, and S. Thrun, Principles of Robot Motion: Theory, Algorithms and Implementations. MIT Press, 2005.
[69] S. M. LaValle, Planning Algorithms. Cambridge University Press (also available at
http://msl.cs.uiuc.edu/planning/), 2006.
[70] A. G. Barto, S. J. Bradtke, and S. P. Singh, “Learning to act using real-time dynamic
programming,” Artificial Intelligence, vol. 72, no. 1-2, pp. 81–138, 1995.
[71] K. Tarabanis, P. Allen, and R. Tsai, “A survey of sensor planning in computer vision,” IEEE
Trans. on Robotics and Automation, vol. 11, no. 1, pp. 86–104, 1995.
[72] P. Svestka and M. Overmars, “Coordinated motion planning for multiple car-like robots
using probabilistic roadmaps,” in Proc. IEEE Int’l Conf. on Robotics and Automation, May
1995, pp. 1631–1636.
[73] D. Ferguson, M. Likhachev, and A. Stentz, “A guide to heuristic-based path planning,” in
Proceedings of ICAPS Workshop on Planning under Uncertainty for Autonomous Systems,
AAAI, June 2005.
[74] E. Frazzoli, M. Dahleh, and E. Feron, “Real-time motion planning for agile autonomous
vehicles,” AIAA Journal of Guidance, Control and Dynamics, vol. 25, no. 1, pp. 116–129,
2002.
[75] ——, “A hybrid control architecture for aggressive maneuvering of autonomous helicopters,”
in IEEE Conf. on Decision and Control, 1999.
[76] O. Yakimenko, “Direct method for rapid prototyping of near-optimal aircraft trajectories,”
AIAA Journal of Guidance, Control and Dynamics, vol. 23, no. 5, pp. 865–875, 2000.
[77] V. Gavrilets, E. Frazzoli, and B. Mettler, “Aggressive maneuvering of small autonomous
helicopters: A human-centered approach,” International Journal of Robotics Research.
[78] C. Dever, B. Mettler, E. Feron, J. Popovic, and M. McConley, “Trajectory interpolation
for parametrized maneuvering and flexible motion planning of autonomous vehicles,” AIAA
Journal of Guidance Control and Dynamics, vol. 29, no. 2, pp. 1–28, 2006.
[79] B. Mettler and Z. Kong, “Receding horizon trajectory optimization with a finite-state value
function approximation,” in American Control Conference (ACC), 2008, pp. 3810–3816.
[80] M. Hwangbo, J. Kuffner, and T. Kanade, “Efficient two-phase 3d motion planning for small
fixed-wing UAVs,” in Proc. IEEE Int’l Conf. on Robotics and Automation, Apr 2007, pp.
1035–1041.
[81] R. W. Beard, An Introduction to Autonomous Miniature Air Vehicles. Unpublished, 2006.
[82] L. E. Dubins, “On curves of minimal length with a constraint on average curvature, and with
prescribed initial and terminal positions and tangents,” American Journal of Mathematics,
vol. 79, pp. 497–516, 1957.
[83] X.-N. Bui, P. Soueres, J.-D. Boissonnat, and J.-P. Laumond, “Shortest path synthesis for
Dubins nonholonomic robot,” in Proc. IEEE Int’l Conf. on Robotics and Automation, 1994.
[84] D. P. Bertsekas, Dynamic Programming and Optimal Control, 2nd ed. Athena Scientific,
Belmont, MA, 2000.
[85] R. Alterovitz, M. Branicky, and K. Goldberg, “Constant-curvature motion planning under
uncertainty with applications in image-guided medical needle steering,” in Workshop on
the Algorithmic Foundations of Robotics, 2006.
[86] D. Nelson, D. Barber, T. McLain, and R. Beard, “Vector field path following for miniature
air vehicles,” IEEE Trans. on Robotics, vol. 23, no. 3, pp. 519–529, Jun 2007.
[87] K. Wu, E. Otoo, and K. Suzuki, “Optimizing two-pass connected component labeling algo-
rithms,” Pattern Analysis and Applications, vol. 12, no. 2, pp. 117–135, Feb 2009.
[88] E. Wan and R. Van Der Merwe, “The unscented Kalman filter for nonlinear estimation,”
in Adaptive Systems for Signal Processing, Communications, and Control Symposium, 2000,
pp. 153–158.
