A survey on vision-based UAV navigation
To cite this article: Yuncheng Lu, Zhucun Xue, Gui-Song Xia & Liangpei Zhang (2018) A
survey on vision-based UAV navigation, Geo-spatial Information Science, 21:1, 21-32, DOI:
10.1080/10095020.2017.1420509
However, monocular cameras are not able to obtain a depth map (Rogers and Graham 1979). A stereo camera is essentially a pair of identical monocular cameras mounted on a rig, which means that it provides not only everything a single camera can offer, but also something extra gained from having two views. Most importantly, it can estimate a depth map based on the parallax principle, without the aid of infrared sensors (Seitz et al. 2006). RGB-D cameras can simultaneously obtain a depth map and a visible image with the aid of infrared sensors, but they are commonly restricted to indoor environments due to their limited range. Fisheye cameras are a variant of monocular cameras that provide a wide viewing angle and are attractive for obstacle avoidance in complex environments, such as narrow and crowded spaces (Matsumoto et al. 1999; Gaspar, Winters, and Santos-Victor 2000).

The rest of the paper is organized as follows: First, in Section 2, we introduce three different kinds of visual localization and mapping methods. Next, in Section 3, we review obstacle detection and avoidance techniques for autonomous flight. Then, in Section 4, we focus on path and trajectory planning approaches. Finally, in Section 5, we conclude, with further discussion of specific challenges and future trends in vision-based UAV navigation.
2. Visual localization and mapping

Considering the environment and prior information used in navigation, visual localization and mapping systems can be roughly classified into three categories: mapless systems, map-based systems, and map-building systems (Desouza and Kak 2002) (Figure 3).

2.1. Mapless system

A mapless system performs navigation without a known map; the UAV navigates only by extracting distinct features from the environment it observes. Currently, the most commonly used methods in mapless systems are optical flow methods and feature tracking methods.

Generally, optical flow techniques can be divided into two categories: global methods (Horn and Schunck 1981) and local methods (Lucas and Kanade 1981). As early as 1993, Santos-Victor et al. (1993) devised a method imitating the flight behavior of bees, estimating object movement through cameras mounted on both sides of a robot. It first computes the optical flow velocity of each camera relative to the walls. If the two velocities are equal, the robot moves along the central line; otherwise, it steers toward the side with the smaller flow. However, this approach tends to perform poorly when navigating in texture-less environments. Since then, optical flow approaches have developed greatly, with several breakthroughs in detection and tracking. Recently, a novel approach was proposed for scene change detection and description using optical flow (Nourani-Vatani et al. 2014). Moreover, by incorporating inertial measurement unit (IMU) data with optical flow measurements (Herissé et al. 2012), researchers have achieved hovering flight and landing maneuvers on a moving platform. With dense optical flow computation, the movements of all moving objects in view can even be detected (Maier and Humenberger 2013), which plays an important role in high-level tasks such as surveillance and tracking shots.
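As a concrete illustration of the local approach, the sketch below tracks sparse corners between two consecutive frames with the pyramidal Lucas-Kanade implementation in OpenCV. It is a minimal example rather than the setup of any system cited above; the frame file names are placeholders.

```python
import cv2
import numpy as np

# Load two consecutive grayscale frames (placeholder file names).
prev_frame = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)
next_frame = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

# Detect up to 200 strong Shi-Tomasi corners in the first frame.
prev_pts = cv2.goodFeaturesToTrack(prev_frame, maxCorners=200,
                                   qualityLevel=0.01, minDistance=8)

# Track the corners into the second frame with pyramidal Lucas-Kanade.
next_pts, status, _ = cv2.calcOpticalFlowPyrLK(
    prev_frame, next_frame, prev_pts, None,
    winSize=(21, 21), maxLevel=3)

# Keep only the successfully tracked points and compute flow vectors.
good_prev = prev_pts[status.flatten() == 1].reshape(-1, 2)
good_next = next_pts[status.flatten() == 1].reshape(-1, 2)
flow = good_next - good_prev

# The mean flow magnitude gives a crude measure of apparent image motion;
# comparing it between the left and right image halves would reproduce the
# bee-like balancing strategy discussed above.
print("tracked:", len(flow), "mean flow magnitude:",
      float(np.linalg.norm(flow, axis=1).mean()))
```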
The feature tracking method has become a robust and standard approach for localization and mapping. It primarily tracks invariant features, such as lines and corners, and determines the movement of an object by detecting the features and their relative movement in sequential images (Cho et al. 2013). During robot navigation, invariant features that have previously been observed in the environment are likely to be re-observed from different perspectives, distances, and illumination conditions (Szenher 2008), which makes them highly suitable for navigation. Traditionally, however, the natural features used in localization and mapping are not dense enough for obstacle avoidance. Li and Yang (2003) proposed a behavioral navigation method that combines a robust visual landmark recognition system with a fuzzy-based obstacle avoidance system.
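The re-observation idea can be sketched with off-the-shelf ORB features and brute-force descriptor matching; the image names and detector settings below are illustrative assumptions, not taken from the cited works.

```python
import cv2

# Two images of the same scene from different viewpoints (placeholders).
img1 = cv2.imread("view_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view_b.png", cv2.IMREAD_GRAYSCALE)

# Detect ORB keypoints and binary descriptors in both views.
orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Match descriptors with Hamming distance; cross-checking keeps only
# mutually consistent pairs, i.e. plausible landmark re-observations.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print("putative landmark re-observations:", len(matches))
```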
2.2. Map-based system

A map-based system predefines the spatial layout of the environment in a map, which enables the UAV to navigate with detour behavior and movement planning ability. Generally, there are two types of maps: octree maps and occupancy grid maps. Different types of maps may contain varying degrees of detail, from a 3D model of the complete environment to the interconnection of environmental elements.

Fournier, Ricard, and Laurendeau (2007) used a 3D volumetric sensor to efficiently map and explore urban environments with an autonomous robotic platform. The 3D model of the environment is constructed using a multi-resolution octree. Hornung et al. (2013) developed an open-source framework for representing 3D environment models. The main idea is to use the octree to represent not only the occupied space, but also the free and unknown space. Furthermore, the 3D model is compacted using an octree map compression method, which allows the system to efficiently store and update the 3D models.
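These probabilistic map representations share one core operation: fusing each new sensor observation into the occupancy belief of a grid cell or octree node. A minimal log-odds update in the style of OctoMap is sketched below; the sensor-model probabilities and clamping bounds are illustrative values only.

```python
import numpy as np

# Log-odds occupancy update, the core operation in OctoMap-style maps.
# Illustrative inverse sensor model (p_hit = 0.7, p_miss = 0.4).
L_HIT = np.log(0.7 / 0.3)   # evidence added when a beam endpoint hits a cell
L_MISS = np.log(0.4 / 0.6)  # evidence removed for cells the beam passes through
L_MIN, L_MAX = -2.0, 3.5    # clamping keeps the map responsive to change

def update_cell(l_old: float, hit: bool) -> float:
    """Fuse one observation into a cell's log-odds occupancy."""
    l_new = l_old + (L_HIT if hit else L_MISS)
    return float(np.clip(l_new, L_MIN, L_MAX))

def occupancy_probability(l: float) -> float:
    """Convert log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + np.exp(l))

# A cell observed occupied twice, then free once: belief rises, then decays.
l = 0.0
for hit in (True, True, False):
    l = update_cell(l, hit)
print("occupancy probability:", round(occupancy_probability(l), 3))
```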
Gutmann, Fukuchi, and Fujita (2008) used a stereo vision sensor to collect and process data that are then used to generate a 3D environment map. The core of the method is an extended scan-line grouping approach that precisely segments the range data into planar segments, which effectively deals with the noise the stereo vision algorithm generates when estimating depth. Dryanovski, Morris, and Xiao (2010) used a multi-volume occupancy grid to represent 3D environments, which explicitly stores information about both obstacles and free space. It also allows previous, potentially erroneous sensor readings to be corrected by incrementally filtering and fusing new positive or negative sensor information.

2.3. Map-building system

Sometimes, due to environmental constraints, it is difficult to navigate with a preexisting accurate map of the environment. Moreover, in some urgent cases (such as disaster relief), it is impractical to obtain a map of the target area in advance. Under such circumstances, building the map during flight is a more attractive and efficient solution.

Map-building systems have been widely used in both autonomous and semi-autonomous fields, and are becoming increasingly popular with the rapid development of visual simultaneous localization and mapping (visual SLAM) techniques (Strasdat, Montiel, and Davison 2012; Aulinas et al. 2008). Nowadays, UAVs are much smaller than before, which limits their payload capacity to a certain extent. Therefore, researchers have shown growing interest in the use of simple (single and multiple) cameras rather than traditional, complex laser radar, sonar, and similar sensors.

One of the earliest attempts at the map-building technique using a single camera was carried out with the Stanford CART Robot (Moravec 1983). Subsequently, an interest operator algorithm was improved to detect 3D coordinates from images. The system essentially demonstrated the 3D coordinates of objects, which were stored on a grid with 2 m cells. Although this technology can reconstruct the obstacles in the environment, it is still incapable of modeling a large-scale world environment. Since then, with the goal of simultaneously recovering the poses of the cameras and the structure of the environment, vision-based SLAM algorithms have made considerable progress and, according to the way the visual sensor data are processed, have derived three types of methods: indirect methods, direct methods, and hybrid methods.

2.3.1. Indirect method

Instead of using images directly, the indirect method first detects and extracts features from the images, and then takes them as inputs for the motion estimation and localization procedures. Typically, features are expected to be invariant to rotation and viewpoint changes, as well as robust to motion blur and noise. Over the last three decades, feature detection and description have been studied comprehensively, yielding various types of feature detectors and descriptors (Li and Allinson 2008; Tuytelaars and Mikolajczyk 2008). Hence, current SLAM algorithms mostly follow the feature-based framework.

Davison (2003) presented a top-down Bayesian framework for single-camera localization with real-time performance via mapping a sparse set of natural features. It is a milestone for monocular visual SLAM and has had a great impact on subsequent work. Klein and Murray (2007) proposed a parallel tracking and mapping algorithm, the first to divide the SLAM system into two parallel independent threads: tracking and mapping. This has almost become the standard for modern feature-based SLAM systems. Mahon et al. (2008) proposed a novel approach for large-scale navigation, in which visual loop-closure detections are used to correct the trajectory drift that is common in long-term, large-scale operation. Celik et al. (2009) presented a visual SLAM-based system for navigation in indoor environments whose layout is unknown and where GPS is unavailable; the UAV is equipped with only a single camera to estimate its state and range. The core of the navigation strategy is to represent the environment via energy-based feature points and straight architectural lines. Harmat, Trentini, and Sharf (2015) proposed a visual attitude estimation system for multiple cameras based on multi-camera parallel tracking and mapping (PTAM) (Klein and Murray 2007).
It combines the ego-motion estimates from multiple cameras and parallelizes the pose estimation and mapping modules. The authors also proposed a novel extrinsic calibration method for multi-camera rigs with non-overlapping fields of view.

However, most indirect methods extract only distinct feature points from images, and can therefore reconstruct at most a specific set of points (traditionally, corners). We call this kind of method the sparse indirect method, as it can only reconstruct a sparse scene map. Researchers have therefore been looking toward dense indirect methods that can reconstruct dense maps. Valgaerts et al. (2012) used a dense energy-based method to estimate the fundamental matrix and to derive dense correspondences from it. Ranftl et al. (2016) produced a dense depth map from two consecutive frames using a segmented optical flow field, which means that the scene can be reconstructed densely under this framework by optimizing a convex program.
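To make the indirect pipeline concrete, the sketch below recovers the relative camera motion from matched feature points via the essential matrix. This is a generic two-view step, not the implementation of any SLAM system discussed above; the camera intrinsics and file names are placeholders.

```python
import cv2
import numpy as np

# Placeholder pinhole intrinsics; a real system uses calibrated values.
K = np.array([[525.0, 0.0, 320.0],
              [0.0, 525.0, 240.0],
              [0.0, 0.0, 1.0]])

img1 = cv2.imread("keyframe.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("current.png", cv2.IMREAD_GRAYSCALE)

# Indirect front end: detect and match sparse ORB features.
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Geometric back end: essential matrix with RANSAC, then pose recovery.
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                  prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)

# R and t give the relative rotation and translation direction; note that
# monocular two-view geometry cannot observe the metric scale of t.
print("rotation:\n", R, "\ntranslation direction:", t.ravel())
```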
2.3.2. Direct method

Though indirect methods prove to perform well in ordinary environments, they are prone to getting stuck in texture-less environments. Direct methods have therefore become a hot spot over the last decade. Unlike indirect methods, direct methods optimize the geometry parameters using all the intensity information in the image, which provides robustness to the photometric and geometric distortions present in images. Apart from that, direct methods usually establish dense correspondences, so they can reconstruct a dense map at an extra computational cost.

Silveira, Malis, and Rives (2008) proposed a novel method for estimating camera poses and scene structure. It directly uses image intensities as observations, exploiting all the information available in the images, which makes it more robust than indirect methods in environments with few feature points. Newcombe, Lovegrove, and Davison (2011) presented a real-time monocular SLAM algorithm, DTAM, which also estimates the camera's 6DOF motion using direct methods. By performing whole-image alignment at frame rate, it can generate dense surfaces based on the estimated detailed textured depth maps with current commodity GPU hardware. Engel, Schöps, and Cremers (2014) employed an efficient probabilistic direct approach to estimate semi-dense maps, which are then used for image alignment. Unlike methods that optimize parameters without scale, LSD-SLAM (Engel, Schöps, and Cremers 2014) uses pose graph optimization over Sim(3) (Kümmerle et al. 2011), which explicitly takes the scale factor into account, allowing scale-drift correction and loop closure detection in real time.
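The objective minimized by direct methods can be illustrated in a few lines. The toy sketch below, on synthetic data, evaluates a photometric residual over integer image shifts; real direct methods minimize the same kind of residual over a full 6DOF warp with Gauss-Newton iterations, so this is only a stand-in under simplified assumptions.

```python
import numpy as np

def photometric_error(img_ref, img_cur, du, dv):
    """Mean squared intensity difference after shifting img_cur by (du, dv).

    Direct methods minimize exactly this kind of residual, but over a full
    6DOF warp with Gauss-Newton rather than the integer shift used here.
    """
    h, w = img_ref.shape
    ref = img_ref[8:h - 8, 8:w - 8].astype(np.float64)
    cur = img_cur[8 + dv:h - 8 + dv, 8 + du:w - 8 + du].astype(np.float64)
    return float(np.mean((ref - cur) ** 2))

rng = np.random.default_rng(0)
frame = rng.uniform(0, 255, (120, 160))
shifted = np.roll(frame, shift=(2, 3), axis=(0, 1))  # simulate camera motion

# Exhaustive search over small shifts stands in for Gauss-Newton iterations;
# the minimum residual recovers the simulated shift (du=3, dv=2).
best = min((photometric_error(frame, shifted, du, dv), du, dv)
           for du in range(-4, 5) for dv in range(-4, 5))
print("estimated shift (du, dv):", best[1:], "residual:", round(best[0], 3))
```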
2.3.3. Hybrid method

The hybrid method combines the direct and indirect methods. It first initializes feature correspondences using indirect methods, and then continuously refines the camera poses with direct methods, which is faster and more accurate.

Forster, Pizzoli, and Scaramuzza (2014) proposed an innovative semi-direct algorithm, SVO, to estimate the state of a UAV. Similar to PTAM, motion estimation and point cloud mapping are implemented in two threads. For motion estimation, a more accurate estimate is obtained by directly using pixel brightness and gradient information, combined with the alignment of the feature points and minimization of the reprojection error. Forster et al. (2015) then built a computationally efficient system for real-time 3D reconstruction and landing-spot detection for UAVs, in which only smartphone processors are used as the processing unit. Unlike PTAM, SVO needs to run with a high-frame-rate camera in order to achieve real-time performance. It was designed mainly for onboard applications with limited computational resources.

2.3.4. Multi-sensor fusion

Since laser scanners can easily acquire 3D point clouds of relatively good quality, they are very popular on ground mobile robots (Bonin-Font, Ortiz, and Oliver 2008). As laser scanners shrink in size, more UAVs can carry them as well, enabling the fusion of different types of measurements from different kinds of sensors. Benefiting from the timeliness and complementarity of multiple sensors, multi-sensor fusion enables more accurate and robust estimation of the state of a UAV.

Lynen et al. (2013) presented a general-purpose multi-sensor fusion extended Kalman filter (MSF-EKF) that can handle delayed measurement signals of different types from multiple sensors, and provides more accurate attitude estimation for UAV control and navigation. Magree and Johnson (2014) proposed an integrated navigation system that combines both visual SLAM and laser SLAM with an EKF-based inertial navigation system. The monocular visual SLAM finds data associations and estimates the state of the UAV, while the laser SLAM system performs scan-to-map matching under a Monte Carlo framework.
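Reduced to a single axis, the flavor of such filter-based fusion looks as follows. This is a textbook linear Kalman filter, not MSF-EKF itself, which additionally handles full 6DOF states, multiple sensor types, and measurement delays; all noise values are illustrative.

```python
import numpy as np

# 1D constant-velocity state [position, velocity]: IMU-style prediction
# fused with an occasional absolute position fix (e.g. from visual SLAM).
dt = 0.01
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
B = np.array([[0.5 * dt * dt], [dt]])   # control (acceleration) input
Q = np.diag([1e-5, 1e-3])               # process noise
H = np.array([[1.0, 0.0]])              # we measure position only
R = np.array([[0.05**2]])               # measurement noise (5 cm std)

x = np.zeros((2, 1))                    # state estimate
P = np.eye(2)                           # state covariance

def predict(accel: float):
    global x, P
    x = F @ x + B * accel
    P = F @ P @ F.T + Q

def correct(z_pos: float):
    global x, P
    y = np.array([[z_pos]]) - H @ x     # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P

# 100 IMU prediction steps at 100 Hz, then one SLAM position fix.
for _ in range(100):
    predict(accel=0.2)
correct(z_pos=0.01)
print("fused position/velocity:", x.ravel())
```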
Table 1. Summary of important methods in localization and mapping.

| Authors | Types | Methods | Sensors |
|---|---|---|---|
| Horn and Schunck (1981) | Mapless | Optical flow | Single camera |
| Lucas and Kanade (1981) | Mapless | Optical flow | Single camera |
| Santos-Victor et al. (1993) | Mapless | Optical flow | Stereo cameras |
| Herissé et al. (2012) | Mapless | Optical flow | Multi-sensors |
| Maier and Humenberger (2013) | Mapless | Optical flow | Single camera |
| Nourani-Vatani et al. (2014) | Mapless | Optical flow | Single camera |
| Li and Yang (2003) | Mapless | Feature tracking | Single camera |
| Szenher (2008) | Mapless | Feature tracking | Single camera |
| Cho et al. (2013) | Mapless | Feature tracking | Single camera |
| Fournier, Ricard, and Laurendeau (2007) | Map-based | Octree map | Depth camera |
| Hornung et al. (2013) | Map-based | Octree map | Stereo cameras |
| Gutmann, Fukuchi, and Fujita (2008) | Map-based | Occupancy grid map | Stereo cameras |
| Dryanovski, Morris, and Xiao (2010) | Map-based | Occupancy grid map | Depth camera |
| Moravec (1983) | Map-building | Multi-sensor fusion | Multi-sensors |
| Davison (2003) | Map-building | Indirect; sparse | Single camera |
| Klein and Murray (2007) | Map-building | Indirect; sparse | Single camera |
| Mahon et al. (2008) | Map-building | Indirect; sparse | Single camera |
| Celik et al. (2009) | Map-building | Indirect; sparse | Single camera |
| Harmat, Trentini, and Sharf (2015) | Map-building | Indirect; sparse | Multi cameras |
| Valgaerts et al. (2012) | Map-building | Indirect; dense | Single camera |
| Ranftl et al. (2016) | Map-building | Indirect; dense | Single camera |
| Silveira, Malis, and Rives (2008) | Map-building | Direct; dense | Single camera |
| Newcombe, Lovegrove, and Davison (2011) | Map-building | Direct; dense | Single camera |
| Engel, Schöps, and Cremers (2014) | Map-building | Direct; semi-dense | Single camera |
| Forster, Pizzoli, and Scaramuzza (2014) | Map-building | Hybrid; sparse | Single camera |
| Forster et al. (2015) | Map-building | Hybrid; sparse | Single camera |
| Lynen et al. (2013) | Map-building | Multi-sensor fusion | Multi-sensors |
| Magree and Johnson (2014) | Map-building | Multi-sensor fusion | Multi-sensors |

3. Obstacle detection and avoidance

Obstacle avoidance is an indispensable module of autonomous navigation: it detects nearby obstacles, provides essential information about them, and reduces the risk of collision as well as of pilot operation errors, thereby greatly increasing the autonomy of UAVs.

The basic principle of obstacle avoidance is to detect obstacles and determine the distances between the UAV and the obstacles. When an obstacle gets close, the UAV is supposed to avoid it or turn around under the instructions of the obstacle avoidance module. One solution is to measure distance using range finders such as radar, ultrasonic sensors, and IR.
Nevertheless, such range finders cannot obtain enough information in complex environments, owing to their limited field of view and measurement range. By contrast, visual sensors capture abundant visual information that can be processed and used for obstacle avoidance.

There are mainly two kinds of methods for obstacle avoidance: optical flow-based methods and SLAM-based methods. Gosiewski, Ciesluk, and Ambroziak (2011) avoided obstacles by means of image processing: based on optical flow, the method generates local information flow and obtains the depth of the image. Al-Kaff et al. (2016) detected the change in obstacle size during flight. This method simulates the mechanism of the human eye, whereby objects in the field of view appear larger as they come closer. Based on this principle, it can detect an obstacle by comparing sequential images and determine whether the obstacle is getting closer.

A variety of bionic insect-vision optical flow navigation methods have also been proposed. Inspired by bees' vision, Srinivasan and Gregory (1992) proposed a simple non-iterative optical flow method for measuring the global optical flow and self-motion of the system. As a basic local motion detection unit, the Reichardt model (Haag, Denk, and Borst 2004) is inspired by the visual nerve structure of insects. Ruffier et al. (2003) designed an optical flow strategy and sensor based on the compound eye structure of flies. These insect vision algorithms have been applied in UAVs for theoretical validation. Recently, inspired by insect vision (Bertrand, Lindemann, and Egelhaaf 2015), the physics student Darius Merk proposed a method that judges the distance to objects using only light intensity (https://techxplore.com/news/2016-07-drone-obstacles-insect.html#nRlv). It is simple but efficient, because many insects in nature detect surrounding obstacles by light intensity. During flight, the image motion on their retina produces an optical flow signal, and this optical flow provides rich information about spatial characteristics for insect visual navigation. Insects can thus quickly determine whether an obstacle can be passed safely, according to the intensity of the light passing through gaps in the foliage.

However, optical flow-based methods cannot acquire precise distances, which may limit their use in some specific missions. By contrast, SLAM-based methods can provide precise metric maps with a sophisticated SLAM pipeline that runs in real time in GPS-denied environments. Esrafilian and Taghirad (2016) put forward a method based on oriented FAST and rotated BRIEF SLAM (ORB-SLAM). First, it processes the video data, computing the 3D locations of the UAV and generating a sparse point cloud map. It then enriches the sparse map into a denser one. Finally, it generates a collision-free road map by applying the potential field method and the rapidly exploring random tree (RRT).
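The size-expansion cue used by Al-Kaff et al. (2016) admits a simple closed form: if an obstacle's apparent size grows from s1 to s2 over an interval dt under a constant closing speed, the time to contact is roughly dt * s1 / (s2 - s1), since apparent size scales inversely with distance. The sketch below is a hedged illustration of this looming principle; the bounding-box sizes and the safety threshold are invented for the example.

```python
def time_to_contact(size_prev: float, size_curr: float, dt: float) -> float:
    """Estimate time to contact from apparent-size expansion (looming).

    Apparent size scales as 1/distance, so under a constant closing speed
    tau = dt * size_prev / (size_curr - size_prev). Returns +inf when the
    object is not expanding (no approach detected).
    """
    growth = size_curr - size_prev
    if growth <= 0.0:
        return float("inf")
    return dt * size_prev / growth

# Bounding-box widths (pixels) of a detected obstacle in frames 0.1 s apart.
tau = time_to_contact(size_prev=40.0, size_curr=44.0, dt=0.1)

# Invented safety threshold for illustration: trigger avoidance below 1.5 s.
if tau < 1.5:
    print(f"obstacle closing, time to contact ~{tau:.2f}s -> avoid")
```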
4. Path planning

Path planning is an important task in UAV navigation. It means finding an optimal path from the starting point to the target point, based on some performance indicators (such as the minimum cost of work, the shortest flying time, or the shortest flying route), while avoiding obstacles along the way. According to the type of environment information used to compute an optimal path, the problem can be divided into two types: global path planning and local path planning. Global path planning aims to find an optimal path based on an a priori global geographical map. However, global path planning alone is not enough to control a UAV in real time, especially when other tasks must be handled immediately or unexpected obstacles appear during the flight. Therefore, local path planning is needed: it constantly acquires sensor information from the surrounding environment and computes a collision-free path in real time. An illustration of the two path planning methods is shown in Figure 4.

4.1. Global path planning

A global path planner requires the start and target locations within a constructed map to calculate an initial path, so the global map is also called a static map. The algorithms commonly used for global path planning include heuristic searching methods and a series of intelligent algorithms.

4.1.1. Heuristic searching methods

The A-star algorithm is a typical heuristic search method, which evolved from the classic Dijkstra algorithm. In recent years, the A-star algorithm has been greatly developed and has spawned many other improved heuristic search methods. Vachtsevanos et al. (1997) used an orographic database to build a digital map and a modified A-star algorithm to search for the best track. Rouse (1989) divided the whole region into square grids and used the heuristic A-star algorithm to achieve optimal path planning, based on the value function of the different grid points along the calculated path. Szczerba et al. (2000) presented the sparse A-star search (SAS) for path planning, which effectively reduces the computational complexity by adding constraints to the space searched during path planning. Stentz (1994) developed the dynamic A-star algorithm, also known as the D-star algorithm, for partially or completely unknown dynamic environments. It is capable of updating the map from unknown environments and replanning the path when new obstacles are detected along it. Sampling-based path planning algorithms, such as the rapidly exploring random trees (RRT) proposed by Yershova et al. (2005), can keep motion path planning from failing when no prior information about the environment is provided.
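A minimal A-star on a 4-connected occupancy grid captures the idea: expansion is ordered by the cost-so-far g plus an admissible heuristic h, here the Manhattan distance, and with h = 0 the search degenerates to Dijkstra. The grid below is a toy example, not taken from any cited work.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected occupancy grid (1 = obstacle, 0 = free)."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(h(start), 0, start, None)]
    parents, g_cost = {}, {start: 0}
    while open_set:
        _, g, node, parent = heapq.heappop(open_set)
        if node in parents:
            continue                      # already expanded via a better path
        parents[node] = parent
        if node == goal:                  # reconstruct path by backtracking
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = ng
                    heapq.heappush(open_set,
                                   (ng + h((nr, nc)), ng, (nr, nc), node))
    return None                           # goal unreachable

grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]
print(astar(grid, (0, 0), (3, 3)))
```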
4.1.2. Intelligent algorithms

In recent years, researchers have tried to use intelligent algorithms to solve global path planning problems, and have proposed many intelligent searching methods. Among these, the most popular are the genetic algorithm and the simulated annealing algorithm (SAA). In (Zhang, Ma, and Liu 2012), the genetic algorithm and SAA methods are applied to the study of path planning: the adaptation function of the path is evaluated using the crossover and mutation operations of the genetic algorithm together with the Metropolis criterion, which improves the efficiency of path planning. In (Andert and Adolf 2009), an improved simulated annealing algorithm and the conjugate direction method are used to optimize global path planning.
4.2. Local path planning

Local path planning is based on local environment information and the UAV's own state estimate, and aims to dynamically plan a local path without collision. Due to uncertain factors, such as the movements of objects in a dynamic environment, path planning in dynamic environments is a highly complex problem. In this case, the path planning algorithms are required to adapt to the dynamic characteristics of the environment by obtaining information (such as size, shape, and location) about unknown parts of the environment through a variety of sensors.

Traditional local path planning methods include spatial search methods, artificial potential field methods, fuzzy logic methods, neural network methods, and so on. Several typical local path planning methods are reviewed below.

The artificial potential field method is a kind of virtual force method proposed by Sugihara and Suzuki (1996). Its basic idea is to place the robot in an abstract artificial gravitational field built from the surrounding environment: the target point exerts an "attraction" on the mobile robot, while obstacles exert a "repulsion," so the robot is controlled by these two forces and gradually moves toward the target point. Bortoff (2000) gave an example of using the artificial potential field method to calculate a path through a radar threat area.
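A minimal sketch of this virtual-force idea follows, using the classic attractive/repulsive formulation; the gains, influence radius, and obstacle layout are illustrative assumptions. Note that summing the two forces can also stall in local minima, a known weakness of potential fields noted again in Section 5.

```python
import numpy as np

# Illustrative gains and influence radius; real planners tune these.
K_ATT, K_REP, RHO0 = 1.0, 0.5, 2.0

def apf_force(pos, goal, obstacles):
    """Net virtual force: attraction to the goal plus repulsion from
    obstacles inside their influence radius RHO0."""
    force = K_ATT * (goal - pos)                     # attractive term
    for obs in obstacles:
        diff = pos - obs
        rho = np.linalg.norm(diff)
        if 1e-9 < rho < RHO0:                        # repulsive term
            force += K_REP * (1.0/rho - 1.0/RHO0) / rho**2 * (diff / rho)
    return force

pos = np.array([0.0, 0.0])
goal = np.array([8.0, 6.0])
obstacles = [np.array([4.0, 2.5])]

# Gradient descent on the potential: step along the net force; clipping
# keeps the discrete update stable where the repulsion grows unbounded.
for _ in range(200):
    pos = pos + 0.05 * np.clip(apf_force(pos, goal, obstacles), -5.0, 5.0)
print("final position:", np.round(pos, 2))
```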
The genetic algorithm provides a general framework for solving complex optimization problems, and in particular for calculating an optimal path. It follows the inheritance and evolution of biological phenomena: according to the principle of "survival competition and survival of the fittest," problems are solved to obtain an optimal solution. It is composed of five main components: chromosome coding, the initial population, the fitness function, the genetic operations, and the control parameters. Many works use the genetic algorithm for aircraft path planning (Pellazar 1998).
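Those five components map directly onto code. The toy genetic algorithm below evolves a waypoint altitude profile over a ridge; the terrain, fitness weights, and GA parameters are all invented for illustration and are far simpler than the planners cited above.

```python
import random

random.seed(1)

# Toy setup: fly from x=0 to x=9 over a ridge; a chromosome encodes the
# altitude at 10 waypoints, and fitness penalizes climbing and collisions.
TERRAIN = [0, 1, 3, 5, 7, 7, 5, 3, 1, 0]          # illustrative ridge profile

def fitness(chrom):
    cost = sum(abs(b - a) for a, b in zip(chrom, chrom[1:]))   # climb effort
    cost += sum(50 for z, t in zip(chrom, TERRAIN) if z <= t)  # collision
    cost += sum(0.1 * z for z in chrom)                        # stay low
    return -cost                                               # higher is better

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(chrom):
    i = random.randrange(len(chrom))
    chrom[i] = max(0, chrom[i] + random.choice((-1, 1)))
    return chrom

# Initial population of random altitude profiles.
pop = [[random.randint(0, 9) for _ in range(10)] for _ in range(40)]
for _ in range(150):                                  # generations
    pop.sort(key=fitness, reverse=True)
    elite = pop[:10]                                  # selection
    pop = elite + [mutate(crossover(random.choice(elite),
                                    random.choice(elite)))
                   for _ in range(30)]
print("best altitude profile:", max(pop, key=fitness))
```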
Neural networks are computational methods inspired by biological function. Gilmore and Czuchry (1992) gave an example of path planning using Hopfield networks. The ant colony algorithm is a bionic algorithm that mimics ant activity (Parunak, Purcell, and O'Connell 2002). As a stochastic optimization method, it imitates the behavioral characteristics of ants, and can thereby solve a series of difficult combinatorial optimization problems.

Table 3. Summary of important methods in path planning.

| Authors | Types | Methods |
|---|---|---|
| Vachtsevanos et al. (1997) | Global | A-star search |
| Szczerba et al. (2000) | Global | Sparse A-star search |
| Stentz (1994) | Global | Dynamic A-star search |
| Yershova et al. (2005) | Global | Rapidly exploring random trees |
| Andert and Adolf (2009) | Global | Simulated annealing |
| Zhang, Ma, and Liu (2012) | Global | Simulated annealing |
| Gilmore and Czuchry (1992) | Local | Hopfield networks |
| Sugihara and Suzuki (1996) | Local | Artificial potential field |
| Bortoff (2000) | Local | Artificial potential field |
| Parunak, Purcell, and O'Connell (2002) | Local | Ant colony algorithm |

5. Conclusions

With the rapid development of computer vision and the growing popularity of small UAVs, their combination has become an active area of research (Szeliski 2010). This paper has introduced the vision-based navigation of UAVs from three aspects: localization and mapping, obstacle avoidance, and path planning. Localization and mapping is the key to autonomous navigation, as it provides location and environmental information for the UAV. Obstacle avoidance and path planning are essential for the UAV to reach the target location safely, quickly, and without collision.

Even though UAVs share similar navigation solutions with ground mobile robots, many challenges remain in vision-based UAV navigation. The UAV needs to process a large amount of sensor information in real time in order to fly safely and steadily, and image processing in particular greatly increases the computational load. It has therefore become a major challenge for a UAV to navigate under the constraints of low power consumption and limited computing resources.

Besides, UAV navigation requires a global or local 3D map of the environment, and the extra dimension means greater computation and storage consumption, so navigating in a large-scale environment for a long time remains a great challenge. In addition, motion blur caused by fast movement and rotation can easily cause tracking and localization failures during flight. Future research on loop detection and relocalization is therefore expected.

Given a partial or complete 3D map, we are required not only to find a collision-free path, but also to minimize the travel length or energy consumption. Unlike 2D path planning, the difficulty in a 3D map increases exponentially with the growing complexity of the dynamic and kinematic constraints of UAVs. There is no common solution to this NP-hard problem, and modern path planning algorithms still suffer from the local minimum problem. A more robust and global optimization algorithm is therefore still under research.

To date, the problem of vision-based UAV navigation, which uses visual sensors as the only external sensors in dynamic, complex, and large-scale environments, remains to be solved and is a prosperous area of research (Jones 2009; Márquez-Gámez 2012; Mirowski et al. 2017).

Considering the trend that sensors are getting smaller as well as more precise, UAVs can be equipped with multiple types of sensors. However, problems often occur when fusing different types of sensor data, which have different noise characteristics and poor synchronization. Despite this, multi-sensor data fusion is expected to yield better pose estimation, which can greatly improve navigation performance (Shen et al. 2014). Currently, as inertial measurement units become smaller and cheaper, fusing IMU and visual measurements together is gaining much more attention (Leutenegger et al. 2015).

We have also found that limited power and perception ability make it infeasible for a single UAV to complete certain types of tasks. With the improvement of autonomous navigation, it is becoming possible for multiple UAVs to complete such tasks together (Maza et al. 2011; Han and Chen 2014).

Although impressive progress has been made in vision-based navigation, many problems remain to be solved before a fully autonomous navigation system becomes reality, such as autonomously avoiding obstacles, generating an optimal path in dynamic scenarios, and updating and assigning tasks dynamically (Roberge, Tarbouchi, and Labonté 2013).

Finally, we list the most important algorithms referenced in this survey. Table 1 enumerates the algorithms of localization and mapping, Table 2 the methods in obstacle detection and avoidance, and Table 3 the methods in path planning.

Funding

This work was supported by the National Natural Science Foundation of China [grant number 61771350]. It was also partially supported by the Open Research Fund of Key Laboratory of Space Utilization, Chinese Academy of Sciences [grant number LSU-SJLY-2017-01]; and the Open Research Fund of State Key Laboratory of Tianjin Key Laboratory of Intelligent Information Processing in Remote Sensing [grant number 2016-ZW-KFJJ02].
Notes on contributors

Yuncheng Lu is a postgraduate at the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University, China. He received the BS degree from the Electronic Information School of Wuhan University in 2016. He majors in photogrammetry and remote sensing, and his research interests include monocular and RGB-D SLAM and their applications in UAVs.

Zhucun Xue is a postgraduate at the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University, China. She received the BS degree from the Electronic Information School of Wuhan University in 2017. Her research interests include monocular visual SLAM and deep learning.

Gui-Song Xia is a professor at the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University, China. He received the BS degree in electronics engineering and the MS degree in signal processing from Wuhan University, Wuhan, China, in 2005 and 2007, respectively, and the PhD degree in image processing and computer vision from the Centre National de la Recherche Scientifique (CNRS) Laboratoire Traitement et Communication de l'Information (LTCI), TELECOM ParisTech, Paris, France, in 2011. From April 2011, he worked as a postdoctoral researcher at CNRS (LTCI and CEREMADE) for about 1.5 years. His current research concentrates on mathematical modeling of images, computer vision, pattern recognition, and their applications in remote sensing imaging. On these topics, he has co-authored more than 90 articles in international journals and conferences. He currently serves as an associate editor of EURASIP Journal on Image and Video Processing, as an area editor of Signal Processing: Image Communication, and as a guest editor of IEEE Transactions on Big Data, Pattern Recognition Letters, Geo-spatial Information Science, and others. He is a senior member of IEEE and serves as the associate chair of the Wuhan Chapter of the IEEE Signal Processing Society (SPS).

Liangpei Zhang received the BS degree in physics from Hunan Normal University, Changsha, China, in 1982, the MS degree in optics from the Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, China, in 1988, and the PhD degree in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 1998. He is currently a Chang-Jiang Scholar Chair Professor at Wuhan University, appointed by the Ministry of Education of China. He has published more than 500 research papers and five books, and holds 15 patents. His research interests include hyperspectral remote sensing, high-resolution remote sensing, image processing, and artificial intelligence. He is a fellow of the Institution of Engineering and Technology, an executive member (Board of Governors) of the China National Committee of the International Geosphere-Biosphere Programme, and an executive member of the China Society of Image and Graphics. He regularly serves as a co-chair of the series of SPIE Conferences on Multispectral Image Processing and Pattern Recognition, the Conference on Asia Remote Sensing, and many other conferences, and edits several conference proceedings, issues, and geoinformatics symposiums. He serves as an associate editor of International Journal of Ambient Computing and Intelligence, International Journal of Image and Graphics, International Journal of Digital Multimedia Broadcasting, and Journal of Remote Sensing, and is currently serving as an associate editor of IEEE Transactions on Geoscience and Remote Sensing.

ORCID

Gui-Song Xia http://orcid.org/0000-0001-7660-6090
Liangpei Zhang http://orcid.org/0000-0001-6890-3650
Hornung, A., K. M. Wurm, M. Bennewitz, C. Stachniss, and W. Burgard. 2013. "OctoMap: An Efficient Probabilistic 3D Mapping Framework Based on Octrees." Autonomous Robots 34 (3): 189–206.

How, J. P., B. Behihke, A. Frank, D. Dale, and J. Vian. 2008. "Real-time Indoor Autonomous Vehicle Test Environment." IEEE Control Systems 28 (2): 51–64.

Hrabar, S. 2008. "3D Path Planning and Stereo-based Obstacle Avoidance for Rotorcraft UAVs." Paper presented at the Intelligent Robots and Systems, IEEE/RSJ International Conference on, Nice, France, September 22–26.

Jones, E. S. 2009. "Large Scale Visual Navigation and Community Map Building." PhD diss., University of California at Los Angeles.

Klein, G., and D. Murray. 2007. "Parallel Tracking and Mapping for Small AR Workspaces." Paper presented at the Mixed and Augmented Reality, 6th IEEE and ACM International Symposium on, Nara, Japan, November 13–16.

Kümmerle, R., G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard. 2011. "g2o: A General Framework for Graph Optimization." Paper presented at the Robotics and Automation, IEEE International Conference on, Shanghai, China, May 9–13.

Langelaan, J. W., and N. Roy. 2009. "Enabling New Missions for Robotic Aircraft." Science 326 (5960): 1642–1644.

Leutenegger, S., S. Lynen, M. Bosse, R. Siegwart, and P. Furgale. 2015. "Keyframe-Based Visual-inertial Odometry Using Nonlinear Optimization." The International Journal of Robotics Research 34 (3): 314–334.

Li, J., and N. M. Allinson. 2008. "A Comprehensive Review of Current Local Features for Computer Vision." Neurocomputing 71 (10): 1771–1787.

Li, H., and S. X. Yang. 2003. "A Behavior-based Mobile Robot with a Visual Landmark-recognition System." IEEE/ASME Transactions on Mechatronics 8 (3): 390–400.

Lucas, B. D., and T. Kanade. 1981. "An Iterative Image Registration Technique with an Application to Stereo Vision." Paper presented at the DARPA Image Understanding Workshop, 7th International Joint Conference on Artificial Intelligence, Vancouver, Canada, August 24–28.

Lynen, S., M. W. Achtelik, S. Weiss, M. Chli, and R. Siegwart. 2013. "A Robust and Modular Multi-Sensor Fusion Approach Applied to MAV Navigation." Paper presented at the Intelligent Robots and Systems, IEEE/RSJ International Conference on, Tokyo, Japan, November 3–7.

Magree, D., and E. N. Johnson. 2014. "Combined Laser and Vision-Aided Inertial Navigation for an Indoor Unmanned Aerial Vehicle." Paper presented at the American Control Conference, IEEE, Portland, USA, June 4–6.

Mahon, I., S. B. Williams, O. Pizarro, and M. Johnson-Roberson. 2008. "Efficient View-based SLAM Using Visual Loop Closures." IEEE Transactions on Robotics 24 (5): 1002–1014.

Maier, J., and M. Humenberger. 2013. "Movement Detection Based on Dense Optical Flow for Unmanned Aerial Vehicles." International Journal of Advanced Robotic Systems 10 (2): 146.

Márquez-Gámez, D. 2012. "Towards Visual Navigation in Dynamic and Unknown Environment: Trajectory Learning and Following, with Detection and Tracking of Moving Objects." PhD diss., l'Institut National des Sciences Appliquées de Toulouse.

Matsumoto, Y., K. Ikeda, M. Inaba, and H. Inoue. 1999. "Visual Navigation Using Omnidirectional View Sequence." Paper presented at the Intelligent Robots and Systems, IEEE/RSJ International Conference on, Kyongju, South Korea, October 17–21.

Maza, I., F. Caballero, J. Capitán, J. R. Martínez-de-Dios, and A. Ollero. 2011. "Experimental Results in Multi-UAV Coordination for Disaster Management and Civil Security Applications." Journal of Intelligent & Robotic Systems 61 (1): 563–585.

Mirowski, P., R. Pascanu, F. Viola, H. Soyer, A. Ballard, A. Banino, and M. Denil. 2017. "Learning to Navigate in Complex Environments." The 5th International Conference on Learning Representations, Toulon, France, April 24–26.

Moravec, H. P. 1983. "The Stanford Cart and the CMU Rover." Proceedings of the IEEE 71 (7): 872–884.

Moreno-Armendáriz, M. A., and H. Calvo. 2014. "Visual SLAM and Obstacle Avoidance in Real Time for Mobile Robots Navigation." Paper presented at the Mechatronics, Electronics and Automotive Engineering (ICMEAE), IEEE International Conference on, Cuernavaca, Mexico, November 18–21.

Newcombe, R. A., S. J. Lovegrove, and A. J. Davison. 2011. "DTAM: Dense Tracking and Mapping in Real-time." Paper presented at the Computer Vision (ICCV), IEEE International Conference on, Washington, DC, USA, November 6–13.

Nourani-Vatani, N., P. Vinicius, K. Borges, J. M. Roberts, and M. V. Srinivasan. 2014. "On the Use of Optical Flow for Scene Change Detection and Description." Journal of Intelligent & Robotic Systems 74 (3–4): 817.

Parunak, H. V., M. Purcell, and R. O'Connell. 2002. "Digital Pheromones for Autonomous Coordination of Swarming UAV's." Paper presented at the 1st UAV Conference, Portsmouth, Virginia, May 20–23.

Pellazar, M. B. 1998. "Vehicle Route Planning with Constraints Using Genetic Algorithms." Paper presented at the Aerospace and Electronics Conference, NAECON, 1998, IEEE National, Dayton, USA, July 17.

Ranftl, R., V. Vineet, Q. Chen, and V. Koltun. 2016. "Dense Monocular Depth Estimation in Complex Dynamic Scenes." Paper presented at the Computer Vision and Pattern Recognition, IEEE Conference on, Las Vegas, USA, June 27–30.

Roberge, V., M. Tarbouchi, and G. Labonté. 2013. "Comparison of Parallel Genetic Algorithm and Particle Swarm Optimization for Real-time UAV Path Planning." IEEE Transactions on Industrial Informatics 9 (1): 132–141.

Rogers, B., and M. Graham. 1979. "Motion Parallax as an Independent Cue for Depth Perception." Perception 8 (2): 125–134.

Rouse, D. M. 1989. "Route Planning Using Pattern Classification and Search Techniques." Paper presented at the Aerospace and Electronics Conference, IEEE National, Dayton, USA, May 22–26.

Ruffier, F., S. Viollet, S. Amic, and N. Franceschini. 2003. "Bio-inspired Optical Flow Circuits for the Visual Guidance of Micro Air Vehicles." Paper presented at the Circuits and Systems, IEEE International Symposium on, Bangkok, Thailand, May 25–28.

Santos-Victor, J., G. Sandini, F. Curotto, and S. Garibaldi. 1993. "Divergent Stereo for Robot Navigation: Learning from Bees." Paper presented at the Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, New York, USA, June 15–17.

Seitz, S. M., B. Curless, J. Diebel, D. Scharstein, and R. Szeliski. 2006. "A Comparison and Evaluation of Multi-view Stereo Reconstruction Algorithms." Paper presented at the Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, New York, USA, June 17–22.
Shen, S., Y. Mulgaonkar, N. Michael, and V. Kumar. 2014. "Multi-Sensor Fusion for Robust Autonomous Flight in Indoor and Outdoor Environments with a Rotorcraft MAV." Paper presented at the Robotics and Automation (ICRA), IEEE International Conference on, Hong Kong, China, May 31–June 7.

Silveira, G., E. Malis, and P. Rives. 2008. "An Efficient Direct Approach to Visual SLAM." IEEE Transactions on Robotics 24 (5): 969–979.

Srinivasan, M. V., and R. L. Gregory. 1992. "How Bees Exploit Optic Flow: Behavioural Experiments and Neural Models [and Discussion]." Philosophical Transactions of the Royal Society of London B: Biological Sciences 337 (1281): 253–259.

Stentz, A. 1994. "Optimal and Efficient Path Planning for Partially-Known Environments." Paper presented at the IEEE International Conference on Robotics and Automation, San Diego, CA, USA, May 8–13.

Strasdat, H., J. M. Montiel, and A. J. Davison. 2012. "Visual SLAM: Why Filter?" Image and Vision Computing 30 (2): 65–77.

Strohmeier, M., M. Schäfer, V. Lenders, and I. Martinovic. 2014. "Realities and Challenges of Nextgen Air Traffic Management: The Case of ADS-B." IEEE Communications Magazine 52 (5): 111–118.

Sugihara, K., and I. Suzuki. 1996. "Distributed Algorithms for Formation of Geometric Patterns with Many Mobile Robots." Journal of Robotic Systems 13 (3): 127–139.

Szczerba, R. J., P. Galkowski, I. S. Glicktein, and N. Ternullo. 2000. "Robust Algorithm for Real-time Route Planning." IEEE Transactions on Aerospace and Electronic Systems 36 (3): 869–878.

Szeliski, R. 2010. Computer Vision: Algorithms and Applications. London: Springer Science & Business Media.

Szenher, M. D. 2008. "Visual Homing in Dynamic Indoor Environments." PhD diss., University of Edinburgh.

Tuytelaars, T., and K. Mikolajczyk. 2008. "Local Invariant Feature Detectors: A Survey." Foundations and Trends in Computer Graphics and Vision 3 (3): 177–280.

Vachtsevanos, G., W. Kim, S. Al-Hasan, F. Rufus, M. Simon, D. Shrage, and J. Prasad. 1997. "Autonomous Vehicles: From Flight Control to Mission Planning Using Fuzzy Logic Techniques." Paper presented at the Digital Signal Processing, 13th International Conference on, IEEE, Santorini, Greece, July 2–4.

Valgaerts, L., A. Bruhn, M. Mainberger, and J. Weickert. 2012. "Dense versus Sparse Approaches for Estimating the Fundamental Matrix." International Journal of Computer Vision 96 (2): 212–234.

Yershova, A., L. Jaillet, T. Simeon, and S. M. Lavalle. 2005. "Dynamic-Domain RRTs: Efficient Exploration by Controlling the Sampling Domain." Paper presented at the IEEE International Conference on Robotics and Automation, Barcelona, Spain, April 18–22.

Zhang, Q., J. Ma, and Q. Liu. 2012. "Path Planning Based Quadtree Representation for Mobile Robot Using Hybrid-Simulated Annealing and Ant Colony Optimization Algorithm." Paper presented at The Intelligent Control and Automation (WCICA), 10th World Congress on, IEEE, Beijing, China, July 6–8.