Position estimation of an
autonomous robot based on
visual odometry
For autonomous cleaning robots used on photovoltaic panels
Arne MA JA
Bimal SAPKOTA
Supervisor(s): Prof. Peter Slaets, Antje Goossens
Co-supervisor(s): Co-founder Nico Vansina
Master Thesis submitted to obtain the degree of Master of Science in Engineering Technology: Electromechanical Engineering
Without written permission of the supervisor(s) and the author(s) it is forbidden to reproduce or adapt in
any form or by any means any part of this publication. Requests for obtaining the right to reproduce or
utilise parts of this publication should be addressed to KU Leuven, Campus GROUP T Leuven, Andreas
Vesaliusstraat 13, 3000 Leuven, +32 16 30 10 30 or via e-mail [email protected]
A written permission of the supervisor(s) is also required to use the methods, products, schematics and pro-
grams described in this work for industrial or commercial use, and for submitting this publication in scientific
contests.
Acknowledgements
First and foremost, we would like to thank our supervisor, professor Peter Slaets, for guiding us
through this thesis.
Second, we would like to thank our daily supervisor, Antje Goossens, for assisting us during the
whole year.
Third, we would like to express our gratitude to Nico Vansina for providing us with materials,
knowledge and feedback during this process.
We would also like to thank Robin Amsters for helping us with the experiments.
Finally, we would like to thank our friends and families for their support.
Arne Ma Ja
Bimal Sapkota
May 2022
Samenvatting
With the rising interest in installing solar panels for residential applications, the demand for cleaning
these panels, in order to maintain a good yield and an attractive appearance of the house, is also
growing. These solar panels are typically installed on rooftops, which makes them difficult to access.
To offer a solution, fully autonomous cleaning robots are being developed. To make such robots
operational on a solar panel, a good estimate of the robot's position is of great importance. This thesis
investigates how the position estimate based on wheel encoders, which are sensitive to slip, and an
IMU can be improved. By applying sensor fusion between an optical flow sensor, the wheel encoders
and the IMU, this position can be estimated more accurately. The position estimate is further improved
by applying edge detection using an ultrasonic distance sensor. The sensor fusion, performed with
Kalman filtering, brought a significant improvement to the position estimate. On average, the sensor
fusion model delivers a position estimate four times better than a model based on wheel encoder and
IMU data.
Keywords: Solar panel, autonomous, cleaning, robot, optical flow, odometry, Kalman filter.
Abstract
With increasing interest to install solar panels on residential buildings, the demand for cleaning
these solar panels in order to maintain a good efficiency and aesthetic, is also increasing. As solar
panels are often installed on rooftops that are hard to reach, fully autonomous cleaning robots are
being developed. For the operation of such robots on a solar panel, a good estimation of their posi-
tion is required. This thesis investigates how a position estimation based on wheel encoders (which
is prone to errors from wheel slippage) and an IMU, can be improved. Sensor fusion is performed
between an optical flow sensor, wheel encoders and an IMU to create a more accurate position
estimation. Position estimation is further improved via edge detection using an ultrasonic distance
sensor. The sensor fusion, applied through the use of Kalman filtering, significantly increased the
accuracy of the position estimation. On average, the sensor fusion performs four times better than
an estimation based on wheel encoder data and IMU readings.
Keywords: Solar panel, autonomous, cleaning, robotics, optical flow, odometry, Kalman filter.
Contents
Acknowledgements. . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Samenvatting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2.3 Ecoppia: H4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4 Sensor fusion . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4 Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.3.2 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.3.4 Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
A PAA5100JE and Raspberry Pi Zero W . . . . . . . . . . . . . . . . . . . A-1
E Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-7
E.4 Mean error and mean squared error comparison between estimations and refer-
ence trajectories . . . . . . . . . . . . . . . . . . . . . . . . . E-10
E.5 Initial performance test of the PAA5100JE optical flow sensor . . . . . . . . . E-11
List of Figures
2.7 Trilateration to find the receiver's position: the receiver is at the intersection of the four
meshed spheres. The radius of a meshed sphere is the distance to the receiver
measured by the satellite at the center of the sphere [7]. . . . . . . . . . . . . . . 7
2.8 The concept of robot localization using a landmark with known coordinates. The
landmark is indicated by the red dot. The red line represents the range between the
landmark and the robot. The position and the heading of the robot can be found using
triangulation [8],[9]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.9 A frame captured from a ceiling-mounted camera used for object tracking. . . . . . 9
2.10 Adafruit 9-DOF IMU Breakout consisting of 3-axis accelerometer, 3-axis gyroscope
and 3-axis magnetometer [10]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.11 Matched SIFT features (top) and filtered matches using RANSAC (bottom) [11] . . 11
2.13 Illustration of the aperture problem: the actual motion of the black lines is towards
the right but inside the aperture (green circle), the motion appears to be towards the
top right corner of the page [13]. . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.14 Lucas-Kanade algorithm makes use of a pyramid to deal with large motions [14] . . 13
2.15 Dense optical flow using Gunnar Farneback’s algorithm: hue (colour) corresponds to
the direction and value (lightness) corresponds to the magnitude [12]. . . . . . . . 13
2.17 Multiplication of two Gaussians (blue and orange) leads to a Gaussian with a smaller
variance (green) [16]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.19 EKF linearization, and the effects of a non-linear function on a Gaussian [16] . . . 18
2.20 Comparison between Kalman and extended Kalman filter equations [16]. The changes
between filters are indicated by the boxed equations. . . . . . . . . . . . . . . . . 18
2.21 Visualization of the actual method, the EKF method, and the unscented transform
(UT) method [17]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.22 Two distributions with different values for α. Left: α = 0.3, right: α = 1. The first and
the second standard deviations (green ellipses) and the sigma points (blue circles).
[16] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.5 Turtlebot3 burger equipped with an optical flow sensor and an ultrasonic distance
sensor. A red square on top for tracking. . . . . . . . . . . . . . . . . . . . . . . . 26
3.6 Tracking a red square on a Turtlebot on a solar panel using a ceiling-mounted camera. 27
4.1 Two frames are captured with the camera displaced over a few centimeters to the
right. Features are found in both frames using the ORB method and they are matched
using the brute force matcher. Very few features are found and even these found fea-
tures cannot be matched properly because they are not distinct enough. . . . . . . 28
4.2 The matching of features is more accurate when the features are distinct. . . . . . 29
4.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.4 Sparse optical flow using a webcam: reference path in red and measured path in
green . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.6 EKF applied to derive the pose from the wheel encoders, the optical flow sensor, and
the IMU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.7 Bad EKF result: green curve is the optical flow sensor, dark blue is the prediction,
red is the encoder, turquoise is the ground truth, and yellow is the fused estimation.
The fused value deviates too much from the ground truth. . . . . . . . . . . . . . . 34
4.9 Turtlebot was given a constant velocity command of 2 cm/s from standstill. A straight
path was driven for a meter. Using the tracker, the instantaneous speeds were mea-
sured. An average speed of 2.4 cm/s (blue dotted line) with a standard deviation of
1.0 cm/s was found. The average is pulled a little higher than the actual value by the
outliers that can be seen on the plot. . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.10 Implementation of a one-dimensional linear Kalman filter to fuse the orientations
from the wheel encoders, the angular velocity command and the IMU. . . . . . . . 37
4.11 A smartphone compass is placed on top of the turtlebot. A rotation of 90◦ is made
with the turtlebot. The orientation from the encoders, the IMU, and the angular ve-
locity command are tested. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.12 Implementation of a 2D linear Kalman filter to fuse the x and the y position from the
1D Kalman filter with the x and the y position derived from the edge detection. . . . 38
4.13 Unprocessed frame from the webcam. Red box: light reflection, yellow box: calibra-
tion pattern, green box: turtlebot with a red square. . . . . . . . . . . . . . . . . . 39
4.14 Warped gray scale image from the calibration algorithm with the calibration sheet . 40
4.15 Unwarped gray scale image from the calibration algorithm with the calibration sheet 40
4.16 Unwarped image overlaid with a threshold mask to filter out the white colour . . . . 41
4.18 Output frame where the mask transform filter is applied to the solar panel to detect
the red square. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.19 Output frame of the tracker where the ROI is adaptive to the current turtlebot position. 43
4.20 Comparison between the unfiltered output of the tracker algorithm (red) and the fil-
tered output (blue) using the Savitzky-Golay filter. . . . . . . . . . . . . . . . . . . . . . 44
5.1 Wheel encoders (red), wheel encoders + IMU (brown), reference (turquoise) . . . . 45
5.2 Wheel encoders (red), wheel encoders + IMU (brown), optical flow + IMU (green),
reference (turquoise) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.3 Wheel encoders (red), wheel encoders + IMU (brown), optical flow + IMU (green),
velocity command prediction (dark blue), reference (turquoise), and fused (encoder
+ optical flow + IMU) (yellow) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
E.2 Finding noise covariances of the orientations derived from the wheel encoders, the
IMU, and the angular velocity command. . . . . . . . . . . . . . . . . . . . . . . . E-8
E.3 Test 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-8
E.7 ME and MSE for the four plots given above . . . . . . . . . . . . . . . . . . . . . E-10
List of Tables
3.2 Standard deviations of displacements [mm] and counts per mm, errors on displace-
ments [mm] and counts per mm, and means of counts per mm . . . . . . . . . . . 26
List of Symbols
Symbols
x̂_0        Initial state estimation of a system
x_bot       Previous estimated x position of the robot [cm]
x_cmd       Previous estimated x position from the velocity command [cm]
x_enc       Previous estimated x position from encoder readings [cm]
x_final     Final estimated x position [cm]
x_fused     Estimated x position as output from the sensor fusion [cm]
x_l         Estimated lower bound of the initial state of a system
x_m,k       X position of the estimated trajectory at timestamp k [cm]
x_opt       Previous estimated x position from optical flow sensor readings [cm]
x_ref,k     X position of the reference trajectory at timestamp k [cm]
x_start     Starting x position of the robot [cm]
x_true,0    Initial true state of a system
x_u         Estimated upper bound of the initial state of a system
x_ultra     Estimated x position from the edge detection [cm]
x′          New estimated x position [cm]
x′_bot      New estimated x position of the robot [cm]
x′_cmd      New estimated x position from the velocity command [cm]
x′_enc      New estimated x position from encoder readings [cm]
x′_opt      New estimated x position from optical flow sensor readings [cm]
y           The residual
y_bot       Previous estimated y position of the robot [cm]
y_cmd       Previous estimated y position from the velocity command [cm]
y_enc       Previous estimated y position from encoder readings [cm]
y_fused     Estimated y position as output from the sensor fusion [cm]
y_final     Final estimated y position [cm]
y_m,k       Y position of the estimated trajectory at timestamp k [cm]
y_opt       Previous estimated y position from optical flow sensor readings [cm]
y_ref,k     Y position of the reference trajectory at timestamp k [cm]
y_start     Starting y position of the robot [cm]
y_ultra     Estimated y position from the edge detection [cm]
y′          New estimated y position [cm]
y′_bot      New estimated y position of the robot [cm]
y′_cmd      New estimated y position from the velocity command from ROS [cm]
y′_enc      New estimated y position from encoder readings [cm]
y′_opt      New estimated y position from optical flow sensor readings [cm]
z           The measurement mean
B           The control input model
F           State transition matrix (KF) and discrete state transition matrix (EKF)
H           The measurement function (KF) and observation matrix (EKF)
I           Unit matrix
K           Kalman gain
ME          Mean error [cm]
MSE         Mean square error [cm²]
NE_k        Norm of the error on timestamp k [cm]
P           Updated state covariance matrix
P̄           Predicted state covariance matrix
P_0         Initial uncertainty matrix
Q           Covariance matrix of the prediction
R           Covariance matrix of the measurement
R_enc       Covariance of the encoder measurement noise
R_opt       Covariance of the optical flow sensor measurement noise
α           Factor which determines the spread of the sigma points (UKF)
β           Factor which represents the prior knowledge about the distribution (UKF)
Δ_e         Incremental displacement derived from the wheel encoders [cm]
Δ_fused     Estimation of the incremental displacement as output of the sensor fusion [cm]
Δt          Time interval between consecutive time stamps [s]
ε           Error
ε_cpmm      Absolute deviation from the average counts per millimeter [cpmm]
ε_distance  Absolute deviation from the mean for the measured distance [mm]
θ_cmd       Previous estimated orientation from the velocity command [radians]
θ_enc       Previous estimated orientation from the wheel encoders [radians]
θ_fused     Estimated orientation as output from the sensor fusion [radians]
θ_IMU       Orientation from the IMU [radians]
θ′_cmd      New estimated orientation from the velocity command [radians]
θ′_enc      New estimated orientation from the wheel encoders [radians]
κ           Secondary scaling factor based on the dimensions of the system (UKF)
μ_cpmm      Average counts per millimeter [cpmm]
μm          Micrometer
σ           Standard deviation
σ_cpmm      Standard deviation on the average counts per millimeter [cpmm]
σ_distance  Standard deviation on the measured distance [mm]
ω_cmd       Angular velocity command [rad/s]
Acronyms
1 INTRODUCTION
This chapter will discuss the scope of this paper as well as the research question and the outline.
The use of photovoltaic and photovoltaic thermal cells increases every year for industrial and do-
mestic applications. In order to properly use the photovoltaic cells, often referred to as PV cells
or solar panels, an optimal integration into their environment is key. Therefore, the PV cells come
in a large variety of sizes and shapes. Due to weather conditions, faeces (usually from birds) and
other pollution, PV cells can become dirty which influences their efficiency as less light can be
absorbed. This problem introduces the necessity to develop a cleaning operation. Solar panels
can be cleaned manually. However, this manual labour is not without risks as the panels are often
installed on sloped roofs. For large solar plants, this is even less appealing due to the large sur-
face of the plant. There are several commercial automated cleaning processes available, mainly
for industrial use. Usually, remotely controlled vehicles are used, although systems integrated into
the solar panels and autonomous ground vehicles exist as well. However, there is no widespread
availability of these cleaning services for domestic use. That is where the Belgian startup, ART
ROBOTICS, comes into play. They aim to design a fully autonomous cleaning robot for small solar
panel installations. The robot should work on sloped panels and be able to cross the gap between
two panels.
Autonomous ground vehicles such as these cleaning robots are commonly equipped with an inertial
measurement unit (IMU) and wheel encoders. However, these sensors are unable to detect wheel
slippage and are influenced by growing errors during operation as they do not provide any absolute
measurement. Therefore, additional information is required. This thesis investigates the possibility
of implementing a commercially available visual odometry system to enhance the accuracy of the
position estimation of an autonomous ground vehicle on a PV cell. Depending on the results, the
use of this localization system could be extended to other applications such as painting robots, or
inspection robots.
The research question that this thesis tries to solve is ”How to improve position estimation of a fully
autonomous solar panel cleaning robot equipped with wheel encoders and an IMU?”.
1.3 Outline of the thesis
This thesis studies the possibility of integrating a visual odometry sensor in a position estimation
algorithm based on wheel encoders and an IMU. A literature study is conducted in chapter two,
investigating the need for automated processes for cleaning solar panels and exploring existing
applications. Different types of localization and sensor fusion methods are described. In chapter
three, a suitable sensor is selected. This sensor is put through preliminary tests to gauge its
accuracy and repeatability. Then, a test setup is described where this sensor works together with
an IMU and wheel encoders. This concludes chapter three. In chapter four, the development
of algorithms for sensor fusion and visual odometry is discussed. In chapter five, the results of
the experiments are presented. These results are discussed in chapter six, followed by a
conclusion in chapter seven.
2 LITERATURE REVIEW
In this chapter, a literature study is conducted on the need for cleaning of solar panels, existing
cleaning applications, different localization methods for a robot, and sensor fusion.
Solar panels are used in the industrial and the residential domain. The PV cells are used to gener-
ate electricity directly while PV thermal cells are used to store heat which can be used for heating
applications.
In industry, PV installations come in two main forms. The first is the large solar farm. This is usually
found in regions with a high day to nighttime ratio (around the equator) in order to maximise the
output. These solar farms usually work with PV thermal cells which generate steam. This steam
can be guided to a generator like the ones used in traditional power plants. These are not popular
in Europe. The use of PV cells to build zero-emission buildings for companies is more common
in European countries like Belgium. These buildings can be equipped with both PV as well as PV
thermal cells.
For domestic use, the amount and size of the PV cells which are used are much smaller. But, they
are installed in more complex environments like sloped roofs of houses. This adds an increased
difficulty to the cleaning process.
The efficiency of a single photovoltaic cell ranges from 10 to 46 % depending on the manufacturer
and the type of cell [18]. However, dust and dirt lower this efficiency. In Belgium, the efficiency can
drop by 3 to 4% after five weeks [19]. The dirt accumulation is also influenced by weather. With rain
and windy weather, the dirt could be washed off which is beneficial to the efficiency. However, this
effect is only true for larger particles (larger than 10 µm). Other types of contamination like moss,
leaves, and faeces are known to decrease the efficiency even further. The research suggests that
regular cleaning is beneficial [19].
There are some automated cleaning services for solar panels available. Most of these services
focus on the large PV cell installations. For the domestic applications, manual cleaning remains the
most popular method. Some of the available automated cleaning systems are presented.
2.2.1 hyCLEANER SOLAR facelift
The hyCLEANER SOLAR facelift cleaning robot (Figure 2.1) is a remote controlled robot using
radio signals with a range up to 100 meters that can work on a surface with a slope of maximum 25
degrees [1]. This robot is designed for large solar plants. Using up to two brushes of 1100 mm long,
the robot has a large working surface. This combined with a maximum speed of 1.5 km/h ensures
cleaning up to five times faster than manual methods. This robot works solely with rainwater, tap
water or technical water.
The GEKKO Solar cleaning robot (Figure 2.2) is another remote-controlled robot using radio signals
with a range of up to 300 m that can work on surfaces angled up to 45 degrees [2]. SERBOT
guarantees that their systems can clean up to four times as fast as manual methods. With a
maximum cleaning length of 1.250 m, they achieve a cleaning speed of 1040 m2 /h. Their robot is
equipped with a 100 m long hose to supply water. Their product is designed to clean smaller solar
farms with narrow solar panels.
2.2.3 Ecoppia: H4
Ecoppia’s latest design called the H4 is shown in Figure 2.3. This solution is different than most
other approaches as this robot is mounted onto a long array of solar panels. It is designed to
cover up to 2 km with a cleaning speed of 22 m/min at a tilt angle from 16 to 50 degrees [20].
Ecoppia states that it can clean up to 1 km of a solar panel array within an hour. While most other
designs use water, this design uses a helix cleaning technology combined with controlled airflow.
It is especially designed for very large applications in remote locations as it uses no water and
powers itself with a mounted solar panel. These features, combined with remote control over
the internet, guarantee completely autonomous operation [20].
SolarCleano is a Belgian company that offers a few solar cleaning solutions. Besides robots closely
resembling the hyCLEANER SOLAR facelift and GEKKO Solar Hightec Robot, they also have a
more unique type namely the SolarBridge (Figure 2.4). Its key differences with other cleaning
applications are the self-adapting height and inclination, large brush variety and options to clean
with or without water [4]. It is completely autonomous and meant for utility-scale solar farms. It
operates at tilt angles from 0 to 34 degrees and reaches a maximum cleaning speed of 40 m/min on even
ground. The SolarBridge is capable of operating for five to eight hours [21].
2.2.5 ART ROBOTICS: Project Helios (in development)
The HELIOS project from the startup ART ROBOTICS tries to come up with a solution for the
high variety of surfaces found in residential PV installations [5]. HELIOS combines one or more small
cleaning robots with a drone that places them onto the solar panels, tackling the reachability problem
of manual cleaning. Once placed on the solar panel, the robots can autonomously
clean the PV cells. When the cleaning process is complete, the drone collects the cleaning
robot(s) [5]. Their solution offers a service which takes into account that customers may not be at
home during the day, and it reduces the risks for people who would otherwise clean their solar panels manually.
Achieving these goals requires a versatile system that can work on different kinds of slopes, clean
different kinds of pollution and has an accurate position estimation.
Different techniques can be applied to determine a vehicle’s position on the ground. These techniques
can be divided into two main categories: techniques using relative measurements and techniques
using absolute measurements. A relative measurement is made with respect to the previous mea-
surement. An absolute measurement is made with respect to some known reference [22].
The best known technique for localization is probably the global positioning system (GPS) (Figure
2.6). It is a satellite based localization system which can determine the absolute position of a
vehicle on earth, based on trilateration applied to measured ranges from the satellites.
GPS is a part of Global Navigation Satellite System (GNSS), which is a constellation of satellites
that orbit around the earth. While GPS is owned by the United States of America, there are other
entities like Russia, China, Japan, Europe, and India who have their own systems for navigation,
positioning and timing [23].
Figure 2.6: Global positioning system [6]
The working principle is similar for these systems. The satellites emit radio signals that travel to
the earth at the speed of light. The signal contains a timestamp of the signal emission and the
satellite’s position. For an accurate timestamp, the satellite uses an atomic clock. Using the speed
of light and the time of flight, the distance between the receiver and the satellite can be solved.
Using two more satellites, the latitude, the longitude and the altitude of the receiver on earth can be
determined by applying trilateration (Figure 2.7). But to compensate for the fact that there is a time
delay, a fourth satellite is required [22],[24],[7]. With at least four satellites, the GPS can position a
receiver with approximately 7.0 meter accuracy, 95 % of the time, anywhere on or near the surface
of the earth [6]. The GPS loses its accuracy when no clear view of the sky is available. There
are techniques available which improve the accuracy of a GPS signal, such as using additional
sensors, using differential GPS, using pseudo-satellites and assisted GPS [22].
Figure 2.7: Trilateration to find the receiver's position: the receiver is at the intersection of the four meshed
spheres. The radius of a meshed sphere is the distance to the receiver measured by the satellite
at the center of the sphere [7].
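To make the trilateration step concrete, the sketch below estimates a receiver position from four satellite ranges that share a common clock-bias term, using a non-linear least-squares fit. The satellite coordinates, the true position and the bias are toy values invented for the example; a real receiver solves the same kind of system from decoded ephemeris data.

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical satellite positions [m] (roughly GPS orbit scale) and a true
# receiver position used only to synthesise the measured ranges.
sats = np.array([[15600e3,  7540e3, 20140e3],
                 [18760e3,  2750e3, 18610e3],
                 [17610e3, 14630e3, 13480e3],
                 [19170e3,   610e3, 18390e3]])
true_pos = np.array([3900e3, 300e3, 5000e3])
clock_bias = 85.0e3                     # range error caused by the receiver clock offset [m]
ranges = np.linalg.norm(sats - true_pos, axis=1) + clock_bias

def residuals(p):
    # p = [x, y, z, b]: receiver position and the common range bias b = c*dt
    return np.linalg.norm(sats - p[:3], axis=1) + p[3] - ranges

sol = least_squares(residuals, x0=np.zeros(4))
print(sol.x[:3])   # approximately recovers true_pos
print(sol.x[3])    # approximately recovers clock_bias
```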
Techniques using landmarks
Another absolute approach for localization is the use of landmarks. The positions of these land-
marks are known. The position of the robot can be found from the measured ranges to the land-
marks using sensors, such as light detection and ranging (lidar) (Figure 2.8).
Lidar is a sensing method that consists of an emitting part and a receiving part. The emitter sends
out laser pulses which get reflected by an object in the environment. These reflections are detected
by the receiver of the lidar. The distance from the lidar to that object can be calculated using the
time of flight and the speed of light.
Besides time-of-flight (TOF) based lidar, there also exists frequency-modulated continuous-wave
(FMCW) lidar. This type of lidar can also measure velocity using the Doppler effect. They also have
a high signal-to-noise ratio. However, they are more expensive and complicated than TOF lidar.
Furthermore, they are not as mature in their development and applications as TOF lidar is [25].
Lidar is an accurate measuring technique that can work in low light conditions [26]. However, it
struggles to make correct measurements when there are translucent or transparent objects in the
environment, because light passes through them. A radio detection and ranging (radar) sensor,
which works with radio waves instead of laser, would be a more appropriate measuring technique
in that case.
Figure 2.8: The concept of robot localization using a landmark with known coordinates. The landmark
is indicated by the red dot. The red line represents the range between the landmark and the
robot. The position and the heading of the robot can be found using triangulation [8],[9].
Map based localization
When the map of the environment is known, the robot can be localized in an absolute manner.
The measurements from the sensors on board of the robot are used to create a map which is
then compared to a previously stored map. When there is a match between these two, the robot’s
position and orientation can be determined. Lidar and camera are among the suitable sensors for
use in this setup [22].
More recent developments allow the mapping and the localization to happen simultaneously. This
is called simultaneous localization and mapping (SLAM). Visual SLAM makes use of images from
cameras and lidar SLAM uses lidar measurements [27].
In indoor environments where the robot’s movement range is limited, the absolute position can also
be determined using a vision based tracking algorithm (Figure 2.9). This technique is applied and
discussed further in chapter 4.
Figure 2.9: A frame captured from a ceiling-mounted camera used for object tracking.
Wheel odometry
One of the most frequently used methods to localize a mobile robot is wheel odometry. It is an
inexpensive measurement technique. Wheel encoders are used to track the rotation of the wheels.
With a known wheel radius, the rotation can be transformed into a linear displacement. Incremental
displacements are integrated to find the overall movement of the robot [22],[28]. However, during
these incremental displacements small errors are accumulated. This leads to a wrong estimation
of the robot’s position. Other errors in the odometry can be caused by unequal wheel diameters,
misalignment of wheels, inaccurate measurement of the wheel base, limited encoder resolution
and sampling rate, and uneven and/or slippery floors leading to wheel slippage [9].
Inertial navigation
Another popular measurement technique is inertial navigation. Using a sensor setup such as in-
ertial measurement unit (IMU) (Figure 2.10), the position and the orientation of a robot can be
determined. An IMU consists of an accelerometer, a gyroscope and sometimes a magnetometer.
The accelerometer measures the accelerations, the gyroscope measures the angular rates and
the magnetometer measures the strength of the magnetic field from which the orientation can be
determined. When there is no magnetometer present, the angular rate is integrated to get the ori-
entation. Also by applying double integration to the measurements from the accelerometer, position
can be determined. However, these integrations of the gyroscope and accelerometer readings are prone to
drift errors, leading to faulty estimations [29],[22].
Figure 2.10: Adafruit 9-DOF IMU Breakout consisting of 3-axis accelerometer, 3-axis gyroscope and 3-axis
magnetometer [10].
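The drift problem can be illustrated in a few lines: even a small constant bias on the gyroscope or accelerometer grows linearly (single integration) or quadratically (double integration) over time. The bias values below are arbitrary.

```python
import numpy as np

# Toy illustration of IMU drift: integrate a gyroscope rate to orientation and
# an accelerometer twice to position, each with a small constant bias.
dt = 0.01                                # sample period [s]
t = np.arange(0, 10, dt)
gyro = np.zeros_like(t) + 0.002          # true rate 0 rad/s, bias 0.002 rad/s
accel = np.zeros_like(t) + 0.05          # true acceleration 0 m/s^2, bias 0.05 m/s^2

theta = np.cumsum(gyro) * dt             # single integration -> linear drift
velocity = np.cumsum(accel) * dt
position = np.cumsum(velocity) * dt      # double integration -> quadratic drift

print(theta[-1])      # about 0.02 rad of heading error after 10 s
print(position[-1])   # about 2.5 m of position error after 10 s
```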
One of the relative position measurement techniques that is not affected by slip or skid, as is the
case with wheel odometry, is visual odometry (VO). VO is used to localize a robot using information
obtained from one or multiple cameras. It is used in a wide range of applications, such as indoor
robots, drones and even on Mars Exploration Rovers.
Sequential frames obtained from the camera(s) are compared to each other in order to extract
information about the displacements of the camera(s). These displacements are integrated to in-
crementally determine the camera’s position as it moves. VO is a contactless measuring technique
that can work with inexpensive cameras and still deliver accurate results. However, visual odometry
is computationally expensive [30]. It also requires good lighting, enough texture and large enough
scene overlap between consecutive frames [31].
When a single camera is used to perform VO, it is called monocular visual odometry (MVO). When
multiple cameras are combined, it is called stereo visual odometry (SVO). MVO is cheaper and
easier to calibrate and deploy compared to SVO. The cameras in SVO need to be synchronised
and the calibration is more difficult than in MVO. However, MVO suffers from scale uncertainty,
whereas a stereo camera setup can determine scale using triangulation [30]. There are two main
types of visual odometry: feature-based VO and appearance-based VO.
Feature based VO
In feature-based VO, features are extracted in consecutive frames. Keypoints and local descriptors
are determined for all of the found features and the features are then matched and tracked across
multiple frames using these descriptors. Outliers are filtered out using algorithms such as the
random sample consensus algorithm (RANSAC) [31]. By tracking the remaining matches, the
motion can be determined.
Features in an image can be lines, curves, edges or simply an interesting shape. There are differ-
ent algorithms available to extract features from an image. The Harris Corner, Shi-Tomasi corner
detector, Scale Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), Fea-
tures from Accelerated Segment Test (FAST), oriented FAST and rotated BRIEF (ORB) are some
of the most known feature detection algorithms [32].
A feature consists of a keypoint and a descriptor. The keypoint contains the 2D pixel position, as
well as the scale and the orientation of the feature if available [33]. The descriptor is a numerical
identification of the feature and is based on the attributes of the environment around the keypoint
location. This descriptor is used to find the same features across multiple frames in order to track
them [30],[12]. The best-known algorithms to determine the descriptors are SIFT, SURF, Histogram of
Oriented Gradients (HOG), Binary Robust Independent Elementary Features (BRIEF), and ORB
[34],[32].
To find matches of features across different frames, algorithms such as Brute-Force Matcher and
FLANN (Fast Library for Approximate Nearest Neighbors) Matcher can be used [32]. However,
not all matches will be correct. Algorithms such as RANSAC can be used to filter out the outliers
(Figure 2.11).
Figure 2.11: Matched SIFT features (top) and filtered matches using RANSAC (bottom) [11]
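A minimal OpenCV sketch of this pipeline is shown below: ORB features are detected in two consecutive frames, matched with a brute-force matcher, and outliers are rejected by RANSAC while fitting a homography. The frame file names are placeholders.

```python
import cv2
import numpy as np

# Feature-based motion estimate between two consecutive frames.
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force matching on the binary ORB descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC keeps only the matches consistent with a single homography.
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
dx, dy = H[0, 2], H[1, 2]   # translational part of the motion, in pixels
print(f"inliers: {int(inlier_mask.sum())}, shift: ({dx:.1f}, {dy:.1f}) px")
```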
Appearance based VO
Appearance based VO looks for changes in appearances in consecutive frames to determine mo-
tion [30]. Instead of tracking features, it tracks brightness patterns. This is also known as the optical
flow. To solve optical flow, there are two assumptions that need to be fulfilled. First, the brightness
of an image point needs to remain constant over time, at least across two consecutive frames.
Second, the motion of neighbouring pixels needs to be similar [12].
There are two approaches to optical flow: sparse optical flow and dense optical flow. Sparse optical
flow tracks brightness patterns of a small window of pixels around given points. The best-known
algorithm based on this technique is the Lucas-Kanade algorithm (Figure 2.12). Points of interest
are first found using corner detection algorithms, such as Shi-Tomasi Corner Detector and Harris
Corner Detector. Corners are good points to estimate flow because they have different directions
of gradient [35]. In contrast, an edge would cause an aperture problem (Figure 2.13). Regions with
low texture also give bad results since the gradients are too small there i.e. the patch is not distinct
enough for tracking.
Figure 2.12: Sparse optical flow using the Lucas-Kanade algorithm [12]
Figure 2.13: Illustration of the aperture problem: the actual motion of the black lines is towards the right but
inside the aperture (green circle), the motion appears to be towards the top right corner of the
page [13].
The brightness patterns around these points are then tracked using the Lucas-Kanade optical flow.
This algorithm tracks the intensity pattern of a 3x3 pixel window around the found points. To deal
with large movements, it makes use of a pyramid. The higher you go on the pyramid, the lower the
resolution of the image gets (Figure 2.14). This means that large motions turn into smaller motions
and the assumption of similar motion in a pixel patch is fulfilled again [12].
Figure 2.14: Lucas-Kanade algorithm makes use of a pyramid to deal with large motions [14]
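A sparse optical flow sketch with OpenCV could look as follows: Shi-Tomasi corners are detected in the first frame and tracked into the second with the pyramidal Lucas-Kanade implementation. The file names and parameter values are illustrative only.

```python
import cv2
import numpy as np

prev = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Corners are good points to track: strong gradients in more than one direction.
p0 = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01, minDistance=7)

# winSize is the tracked intensity patch, maxLevel the number of pyramid levels.
p1, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None,
                                           winSize=(21, 21), maxLevel=3)

good_new = p1[status.flatten() == 1]
good_old = p0[status.flatten() == 1]
mean_flow = (good_new - good_old).reshape(-1, 2).mean(axis=0)
print("average pixel displacement (dx, dy):", mean_flow)
```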
Dense optical flow tracks the brightness patterns of all the pixels in a frame (Figure 2.15). This
makes the dense optical flow algorithm computationally more expensive than the sparse method
[12]. The best-known algorithm for this technique is Gunnar Farneback’s algorithm. This
algorithm approximates all the pixel neighborhoods in two frames using quadratic polynomials. It
determines displacement fields from polynomial expansion coefficients [36].
Figure 2.15: Dense optical flow using Gunnar Farneback’s algorithm: hue (colour) corresponds to the direc-
tion and value (lightness) corresponds to the magnitude [12].
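For comparison, the dense variant computes a flow vector for every pixel. The sketch below uses OpenCV's Farneback implementation with commonly used, but otherwise arbitrary, parameter values.

```python
import cv2
import numpy as np

prev = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Dense optical flow: one (dx, dy) vector per pixel.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

# Convert to magnitude/angle, as in the HSV visualisation of Figure 2.15.
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
print("median flow magnitude [px]:", np.median(mag))
```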
An example of a technology that uses optical flow is the optical mouse used for computers. It has
a very low resolution camera with a very high frame rate. The high frame rate reduces the effects
of speed changes on the measurements. The low resolution makes the computation faster.
In general, feature-based VO is faster and more accurate than appearance based VO as long as
there are enough features present [37]. It can also compute larger displacements. If there are not
a lot of features present in a scene, appearance based VO might be the better choice.
2.3.4 Conclusion
There are many techniques available for the localization of an autonomous mobile robot. They
come in two forms: absolute and relative measurement techniques. These techniques are often
used in combination with each other. For example, wheel encoders can be used in combination
with lidar, VO, and IMU to localize a mobile robot accurately. The process of combining data from
multiple sensors is called sensor fusion. It depends on the application whether it is useful to use a
certain measurement technique or not. For an autonomous robot moving on a solar panel, some of
the techniques discussed are not usable. Since the dimensions of a solar panel are smaller than
the accuracy of a typical GPS, this technique is not suitable here. Furthermore, the use of lidar
is also not reliable enough, because the presence of good landmarks is not always guaranteed on
a rooftop. Map based localization and object tracking algorithms are not an option for an outdoor
environment that varies every time. Wheel odometry, inertial navigation, and visual odometry are
the techniques that could be applied successfully for this application. These techniques can be
fused together to form a robust position estimation of a mobile robot.
The measurement techniques discussed above can be merged together to create a more accurate
estimation of a vehicle’s state. This merging is referred to as sensor fusion. Different or similar
sensors can be fused in three different ways [38].
The first approach is the competitive configuration. Measurements are fused to achieve higher
accuracy and reliability [38]. An example is the fusing of the displacement from an encoder with the
displacement from an optical flow sensor. Both the encoder and the optical flow sensor measure
the same thing, but combining them makes the belief in displacement estimation stronger.
The second configuration is the complementary one. The sensors do not depend on each other directly,
but their measurements are combined to obtain a more complete picture of the system [38]. An example
is combining wheel encoders, which measure displacement, with an IMU, which measures orientation.
The third configuration is the cooperative one. It requires multiple sensors of the same kind in order
to obtain data which would be unobtainable from just one sensor [38]. An example of this method
is the use of a stereo camera configuration to estimate the depth through triangulation. Another
example is the use of two wheel encoders to determine a robot’s orientation.
These configurations are not mutually exclusive. Many applications implement more than one of
these configurations together. The Kalman filters and particle filters are the most commonly used
methods for sensor fusion [38],[39].
There are a few methods which are the standard go-to methods to combine measurement data
for estimation of a system’s state. The Kalman filter is the most popular one. Besides this, the
particle filter is gaining more and more popularity as well. Each of these methods comes with its
own advantages and disadvantages.
The Kalman filter is a part of the Bayes filter family. A Bayes filter probabilistically estimates a dy-
namic system’s state from noisy observations [40]. Figure 2.16 demonstrates the working principle
of the Bayes filter.
A robot is moving in a hallway. It has a sensor to measure whether it is in front of a door or a
wall. In the beginning, the robot has no idea of its location in the hallway so the belief bel(x) for
every position is the same (Figure 2.16a). Then, a measurement p(z|x) is made and it states that
the robot is in front of a door (Figure 2.16b). Since there are three doors in the hallway, the robot
can be in front of any of those three doors. The belief is updated accordingly. The robot moves
forward (Figure 2.16c). The belief curves flatten because the motion itself causes uncertainty in
the estimation of the position. Another measurement is made (Figure 2.16d). This measurement
also states that the robot is in front of a door. Given the history of the belief, there is only one door
where the robot could be. So the belief for this location becomes very high compared to the other
locations. The robot moves forward again, causing the belief curves to flatten once more due to
uncertainty caused by motion (Figure 2.16e). Thus, just by taking two measurements and updating
the belief accordingly, the robot’s position is estimated fairly accurately.
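The hallway example can be written down in a few lines. The sketch below implements a discrete Bayes filter over ten cells with an assumed door map and assumed sensor and motion probabilities; only the structure matters here: predict by shifting and blurring the belief, update by multiplying it with the measurement likelihood.

```python
import numpy as np

# Toy hallway: ten cells, doors at cells 0, 1 and 8 (arbitrary map).
hallway = np.array([1, 1, 0, 0, 0, 0, 0, 0, 1, 0])    # 1 = door, 0 = wall
belief = np.ones(len(hallway)) / len(hallway)          # uniform: no idea where we are

def update(belief, z, p_hit=0.8, p_miss=0.2):
    # Scale the belief by the likelihood p(z|x) of the measurement in each cell.
    likelihood = np.where(hallway == z, p_hit, p_miss)
    posterior = belief * likelihood
    return posterior / posterior.sum()

def predict(belief, kernel=(0.1, 0.8, 0.1)):
    # Move one cell forward with some uncertainty: the belief shifts and flattens.
    prior = np.zeros_like(belief)
    for i, b in enumerate(belief):
        for offset, p in zip((0, 1, 2), kernel):        # under-, exact, overshoot
            prior[(i + offset) % len(belief)] += b * p
    return prior

belief = update(belief, z=1)    # sensor reports "door"
belief = predict(belief)        # robot drives one cell forward
belief = update(belief, z=1)    # sensor reports "door" again
print(np.round(belief, 2), "most likely cell:", belief.argmax())
```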
The discrete Bayes filter works by multiplying and adding arbitrary probability distributions in order
to estimate the next state of the system [16]. It is multimodal, meaning that it can express strong
belief for more than one state estimate at a given point in time. Multimodal state estimators are
computationally expensive. This is not always practical for application in the real world. Often, a
single estimate is desired at real-time speed. The Kalman filter is a variation on the discrete
Bayes filter where the arbitrary probability distributions are replaced by Gaussians. It is unimodal
which makes it very efficient.
A Gaussian is completely defined by its mean and covariance. The Kalman filter will approximate its
beliefs based on this mean and covariance [40]. These beliefs are the calculated uncertainties for a
given point in time. With the Kalman filter, it is possible to decrease the uncertainty of the estimated
state. For this to work, the uncertainties of the used sensors and the dynamic model need to
approximate Gaussian distributions. Furthermore, the dynamic model and the measurement model
must be linear. In that case, a one-dimensional Kalman filter can be understood as a multiplication
of multiple Gaussians (Figure 2.17). This principle can also be extended to multiple dimensions.
Figure 2.17: Multiplication of two Gaussians (blue and orange) leads to a Gaussian with a smaller variance
(green) [16].
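The one-dimensional update is exactly this product of Gaussians. A small sketch with illustrative numbers:

```python
# Product of two Gaussians: the result is again a Gaussian whose variance is
# smaller than either input variance.
def multiply(mean1, var1, mean2, var2):
    mean = (var2 * mean1 + var1 * mean2) / (var1 + var2)
    var = (var1 * var2) / (var1 + var2)
    return mean, var

# e.g. a prediction (10 cm, variance 4) combined with a measurement (12 cm, variance 1)
print(multiply(10.0, 4.0, 12.0, 1.0))   # -> (11.6, 0.8): pulled towards the more
                                        #    certain value, and the variance shrinks
```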
The Kalman filter consists of three steps (Figure 2.18). The first step is the initialization. The initial
state estimate and the uncertainty of this estimate are set. The second step is called the predict
step. The dynamic model of the system and the control input are used to predict the next state and
its uncertainty. The third step is the update step, where a measurement is made. The residual,
the difference between the measured value and the predicted value, is calculated. Based on the
uncertainty of the predicted state and the uncertainty of the measurement, the Kalman gain is
defined. Using the Kalman gain, the residual, and the predicted state, the new state estimate is
obtained. The uncertainty of this new state estimate is updated based on the Kalman gain and the
uncertainty of the predicted state. So, the new estimated state is a weighted value between the
predicted state and the measured state. The Kalman gain is the weighting factor here and it lies
between zero and one. The closer it is to one, the more the filter trusts the measurement. If it is
closer to zero, the filter trusts the prediction.
Figure 2.18: Visualisation of the Kalman algorithm
When it is desired to estimate multiple states of a system, a multivariate Kalman filter is used. The
equations can then be written in state space form. Equation 2.1 gives the predict formulation and
equation 2.2 gives the update formulation for the multivariate Kalman filter
x̄ = F x + B u
P̄ = F P Fᵀ + Q                                    (2.1)

y = z − H x̄
K = P̄ Hᵀ (H P̄ Hᵀ + R)⁻¹
x = x̄ + K y
P = (I − K H) P̄                                   (2.2)
with y the residual, z the measurement mean, H the measurement function, K the Kalman gain, P
the state covariance, x the state mean, R the measurement noise covariance, F the state transition
function, Q the process noise covariance, and u the control inputs of the system [16].
The design of the R and the Q matrices is very important but also complicated. Furthermore, no
sensor’s uncertainty is truly Gaussian. Before using the Kalman filter, one should make sure that
this simplification is justified [16].
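As an illustration of equations 2.1 and 2.2, the sketch below implements one predict/update cycle in plain numpy for a toy constant-velocity model with position-only measurements. The matrices, noise values and measurements are invented for the example and are not the models used later in this thesis.

```python
import numpy as np

dt = 0.1
F = np.array([[1, dt],
              [0,  1]], dtype=float)     # state transition (position, velocity)
B = np.array([[0.0], [0.0]])             # no control input in this toy example
H = np.array([[1.0, 0.0]])               # only the position is measured
Q = 0.01 * np.eye(2)                     # process noise covariance
R = np.array([[0.5]])                    # measurement noise covariance

x = np.array([[0.0], [0.0]])             # initial state estimate
P = np.eye(2)                            # initial state covariance

def kf_step(x, P, z, u=np.zeros((1, 1))):
    # Predict (eq. 2.1)
    x_bar = F @ x + B @ u
    P_bar = F @ P @ F.T + Q
    # Update (eq. 2.2)
    y = z - H @ x_bar                    # residual
    K = P_bar @ H.T @ np.linalg.inv(H @ P_bar @ H.T + R)
    x_new = x_bar + K @ y
    P_new = (np.eye(2) - K @ H) @ P_bar
    return x_new, P_new

for z in [0.11, 0.23, 0.29, 0.42]:       # fake position measurements [m]
    x, P = kf_step(x, P, np.array([[z]]))
print(x.ravel())                          # estimated position and velocity
```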
The Kalman filter is a very popular tool in autonomous vehicle control systems due to its capability of
real-time performance. However, the Kalman filter has one important downside: it is limited to working
with linear models. In reality, most systems are non-linear. Therefore, the extended Kalman filter is
implemented as it can handle non-linear systems [40].
When a Gaussian is propagated through a non-linear function, the output is no longer a Gaussian
(Figure 2.19b). Thus, the assumption of Gaussian distributions in the Kalman filter is no longer fulfilled,
making this filter ineffective for non-linear models. The extended Kalman filter is able to handle non-
linear systems. It does so through linearization at the point of the current estimate, using a first-order
Taylor approximation (Figure 2.19a). The linear Kalman filter is then applied to this linearized system. It
was one of the first techniques used for non-linear systems, and it remains the most used technique
[16].
Figure 2.19: EKF linearization, and the effects of a non-linear function on a Gaussian [16]
In order to linearize the system, the EKF takes the Jacobian of the state transition function f (x, u)
and the observation model h(x̄), at the current state estimate. The Jacobian of f (x, u) gives F, the
discrete state transition matrix. The Jacobian of h(x̄) gives H, the observation matrix.
Taking these Jacobians into consideration, the multivariate Kalman filter equations can be trans-
formed into EKF equations as shown in figure 2.20.
Figure 2.20: Comparison between Kalman and extended Kalman filter equations [16]. The changes be-
tween filters are indicated by the boxed equations.
There is a more stable and numerically robust equation for the state covariance update, known as
the Joseph form [41]. The state covariance matrix is then given by

P = (I − K H) P̄ (I − K H)ᵀ + K R Kᵀ              (2.3)

This equation guarantees that the state covariance matrix remains symmetric and positive as long as the
previous state covariance matrix is positive and symmetric.
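The sketch below shows how these pieces fit together for a simple unicycle motion model: the non-linear prediction, the Jacobian F evaluated at the current estimate, and the Joseph-form covariance update of equation 2.3. The noise values and the position-only measurement model are illustrative assumptions, not the configuration used in this work.

```python
import numpy as np

dt = 0.1
Q = np.diag([1e-4, 1e-4, 1e-5])          # process noise covariance (toy values)
R = np.diag([4e-4, 4e-4])                # measurement noise covariance for (x, y)
H = np.array([[1, 0, 0],
              [0, 1, 0]], dtype=float)   # Jacobian of h(x) = (x, y)

def ekf_step(x, P, u, z):
    v, w = u                             # linear and angular velocity command
    px, py, th = x
    # Non-linear motion model f(x, u) and its Jacobian F at the current estimate.
    x_bar = np.array([px + v * dt * np.cos(th),
                      py + v * dt * np.sin(th),
                      th + w * dt])
    F = np.array([[1, 0, -v * dt * np.sin(th)],
                  [0, 1,  v * dt * np.cos(th)],
                  [0, 0,  1]])
    P_bar = F @ P @ F.T + Q
    # Update with the Joseph-form covariance (eq. 2.3).
    y = z - H @ x_bar
    K = P_bar @ H.T @ np.linalg.inv(H @ P_bar @ H.T + R)
    x_new = x_bar + K @ y
    A = np.eye(3) - K @ H
    P_new = A @ P_bar @ A.T + K @ R @ K.T
    return x_new, P_new

x, P = np.zeros(3), np.eye(3) * 0.01
x, P = ekf_step(x, P, u=(0.02, 0.0), z=np.array([0.0021, 0.0001]))
print(x)
```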
The Kalman filter needs to be tuned adequately in order for it to function well. The parameters that
need to be considered are the measurement noise covariance R, the initial state estimate x̂0 , the
initial state covariance P0 , and the process noise covariance Q.
The measurement noise covariance R, will influence the weight that is given to the measurements.
Selecting an R too large might result in a divergence from the true states. Usually, the given infor-
mation from the manufacturer is used to determine R. The noise covariances of the measurements
are considered time-invariant most of the time. If this is the case, R is typically a diagonal
matrix with the squared standard deviations on its diagonal, multiplied by a safety factor k which is typically larger
than or equal to one [42].
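In code, such a time-invariant R is simply a scaled diagonal matrix built from the (assumed) datasheet standard deviations:

```python
import numpy as np

# Measurement noise covariance from placeholder datasheet standard deviations,
# inflated by a safety factor k >= 1.
k = 1.5
sigmas = np.array([0.02, 0.02])      # e.g. std. dev. of the x and y measurements [m]
R = k * np.diag(sigmas ** 2)
print(R)
```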
The initial state estimate x̂0 and the initial uncertainty on that state P0 have to be chosen as well.
Defining P0 too large can result in the Kalman gain putting too much trust on the measurements.
Choosing P0 too low would make the Kalman gain very small which could cause the filter to di-
verge. The true state x_true,0 can never be exactly known. The initial uncertainty P0 can also not be
measured. The rule of thumb is to define
Yet, here again the true state is required. As the true state x_true,0 is not known, x̂_0 − x_true,0 can be
approximated by the following equation:
where x_u and x_l respectively are reasonably accurate upper and lower bounds on the initial state
[42]. Another parameter that has to be defined is the process noise covariance matrix Q. This
matrix is considered the hardest to define and is usually done through trial and error where a
constant diagonal Q is defined [42].
The unscented Kalman filter (UKF) is a further refinement of the extended Kalman filter.
The EKF handles non-linearities by linearizing around the current mean value. In contrast, the UKF
uses multiple points to linearize. Figure 2.21 visualizes the different approaches for propagation of
mean and covariance: the actual method, the EKF method, and the UKF method. In case of the
actual method, a lot of points are propagated to find the new mean and covariance. This method is
very computationally expensive which makes it impractical. The EKF method uses only the current
mean to estimate the next best Gaussian, which makes it very efficient. The UKF method uses
more than one point, but less points than the actual method.
Figure 2.21: Visualization of the actual method, the EKF method, and the unscented transform (UT) method
[17].
The points used in the UKF are called the sigma points. These are important points that represent
the current Gaussian. They are given a weight based on their position with respect to the mean.
There are many algorithms available to generate a suitable set of sigma points. Since 2005,
the most commonly used method has been the one published by Rudolph Van der Merwe [16].
His formulation uses three parameters: α, β, and κ, which can be determined by the designer to
compute sigma points and their weights. α determines the spread of the sigma points. This is
visualized in figure 2.22. Note that the mean is also a sigma point. The larger the value for alpha,
the bigger the spread of the sigma points, and the bigger the weight assigned to the mean. A good
choice for the α parameter always lies between zero and one. β represents prior knowledge of the
distribution and is equal to two for Gaussian problems. κ is a secondary scaling factor and a good
choice is
κ = 3−n (2.7)
where n is the dimensionality of the system [16],[17]. Once the sigma points are computed and
their weights have been assigned, they are transformed through the non-linear function. A new
Gaussian is determined based on these transformed points. Finally, the mean and the covariance
of the new Gaussian can be determined. This principle is referred to as the unscented transform
(UT). The calculated mean and covariance are dependent on the amount of sigma points used.
A good balance between computational time and accuracy should be determined by the designer
[16].
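Assuming the FilterPy Python library is available (it provides MerweScaledSigmaPoints and unscented_transform, following Van der Merwe's formulation), the unscented transform of a toy two-state Gaussian through a non-linear function looks as follows; the function, mean and covariance are arbitrary examples.

```python
import numpy as np
from filterpy.kalman import MerweScaledSigmaPoints, unscented_transform

# Sigma points for a 2-state system, using alpha, beta and kappa as discussed above.
n = 2
points = MerweScaledSigmaPoints(n, alpha=0.3, beta=2.0, kappa=3 - n)

x = np.array([1.0, 0.5])                 # current mean
P = np.diag([0.1, 0.05])                 # current covariance
sigmas = points.sigma_points(x, P)       # 2n + 1 = 5 sigma points

def f(s):
    # Some non-linear function, e.g. a polar-like transformation.
    return np.array([s[0] * np.cos(s[1]), s[0] * np.sin(s[1])])

# Propagate the sigma points and recover the new mean and covariance.
sigmas_f = np.array([f(s) for s in sigmas])
mean, cov = unscented_transform(sigmas_f, points.Wm, points.Wc)
print(mean)
print(cov)
```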
Figure 2.22: Two distributions with different values for α. Left: α = 0.3, right: α = 1. The first and the second
standard deviations (green ellipses) and the sigma points (blue circles). [16]
The unscented Kalman filter has a similar working principle as the extended Kalman filter with the
initialize, the predict, and the update steps. It is commonly used for high non-linearities, while
the traditional extended Kalman filter remains the most used method for systems with lower non-
linearities [16]. The unscented Kalman filter usually achieves more accurate results than the ex-
tended Kalman filter, especially for highly non-linear systems. For less non-linear systems, the
accuracy is very similar. The drawback of the unscented Kalman filter is the higher computational
time compared to the EKF [16].
The particle filter is a type of ensemble filtering which is based on the Bayesian Monte Carlo tech-
nique [16]. This filter is designed to handle multimodal, highly non-linear systems. In contrast to
the Kalman filter, the particle filter works for any arbitrary, non-analytic probability distribution. By
combining a large enough number of particles (the ensemble) per time step, an accurate approx-
imation of the distribution can be made. Importance sampling allows computing the
probabilities even if the underlying probability distribution is not known [16]. The downside of the
particle filter is that the computational time is significantly higher than that of the Kalman filter. Besides
this, the filter is also fickle. The validation of this filter can be hard to prove and the designer has
to watch out for the degradation and divergence of particles [16]. Also, when working with multiple
objects to track, the particles have to be clustered and assigned to an object. When these objects
move close to each other, it can be hard to determine the paths of these intersecting or crossing
objects.
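A bare-bones particle filter for a one-dimensional position illustrates the predict/weight/resample cycle described above. All noise levels and the resampling threshold are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
particles = rng.uniform(0, 1, N)          # initial guess spread over 1 m
weights = np.ones(N) / N

def step(particles, weights, motion, z, motion_std=0.01, meas_std=0.02):
    # Predict: move every particle with some motion noise.
    particles = particles + motion + rng.normal(0, motion_std, len(particles))
    # Update: weight the particles by the Gaussian likelihood of the measurement.
    weights = weights * np.exp(-0.5 * ((z - particles) / meas_std) ** 2)
    weights = weights / weights.sum()
    # Resample (multinomial) when the effective number of particles drops too low.
    if 1.0 / np.sum(weights ** 2) < len(particles) / 2:
        idx = rng.choice(len(particles), len(particles), p=weights)
        particles = particles[idx]
        weights = np.ones(len(particles)) / len(particles)
    return particles, weights

particles, weights = step(particles, weights, motion=0.05, z=0.07)
print("estimate [m]:", np.average(particles, weights=weights))
```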
3 DESIGN
This chapter discusses the visual odometry sensor selection process, the isolated sensor test and
a more elaborate test setup where the sensor works in combination with other sensors. The goal
is to investigate whether the selected sensor is a good candidate to be used in combination with
wheel encoders and IMU to estimate a robot’s position.
There are many off-the-shelf visual odometry sensors available in the market today. A few of them
can be considered for the application of this thesis. There are two important requirements that the
selected sensor must fulfil.
Since the robot moves on a solar panel, the sensor needs to be reliable on reflective surfaces.
Furthermore, the sensor should be suitable for a ground vehicle i.e. it should perform well at low
ranges.
Sensors that are candidates for this sensor selection process are listed in table 3.1 with their range,
applications, advantages, and disadvantages.
Most of the listed sensors are designed for drones, which means that their minimum operating
height is likely high. The Pimoroni PAA5100JE breakout seems to be the most suitable sensor,
since it is designed for cleaning robots that operate on ground. It has a low minimum range and
can work on reflective surfaces like tiles and wet wooden floors. Furthermore, it has a high frame
rate of 242 frames per second, which supports the optical flow assumption of similar motion of
neighbouring pixels. Therefore, this sensor is selected and tested in the following sections.
Table 3.1: Available off-the-shelf sensors
ADNS3080 — range: 2.3–2.5; application: mice for PCs and game consoles; advantages: low cost, interchangeable lens; disadvantage: does not work on glass surfaces [49]
3.2 Preliminary tests and results
It is unclear whether this sensor works with dense or sparse optical flow. However, frame captures
from the sensor give low resolution images (Figure 3.1). These do not seem suitable for corner
detection and therefore the sensor probably works with dense optical flow where flow is calculated
for every pixel.
Figure 3.1: Frame from PAA5100JE: low resolution image of a parquet floor
Initial experiments are conducted to determine the optical flow sensor’s repeatability. The measure-
ments are done under different lighting conditions and speeds to understand their effects on the
repeatability.
The sensor is mounted on a square plate made of medium-density fibreboard (MDF) (Figure 3.2).
Cardboard edges are added to remove the effects from the environmental light (Figure 3.2a). The
sensor comes with two bright LED’s from the manufacturer. An extra diffused light source (LED in
a semi-transparent straw) is added on top of the sensor (Figure 3.2b).
The MDF square plate can be moved on a rail, also made of MDF (Figure 3.3a). The maximum
possible displacement of the square’s center on the rail is 1 m. The sensor is connected to a Rasp-
berry Pi Zero according to the circuit diagrams given in appendix A.1, A.2 to set up communication
through SPI. SPI is a commonly used protocol for communication between integrated boards [51].
24
It is a four wire communication protocol where a master can communicate with one or more slaves.
The Raspberry Pi Zero is powered through an USB-connection by a laptop.
(a) Rail system to move the sensor for 1 m (b) Rail system design where the plate moves 1 meter
from the right to the left
Series of measurements are summarized in Table 3.2. These are made on a solar panel and a
parquet floor for a displacement of 1 m (Figure 3.4). Parameters to be considered are the lighting
(diffused vs direct), the speed (slow is on average 0.05 m/s and fast is on average 0.1 m/s), and the
type of the surface (solar panel or parquet). Also a scale (counts per mm) is determined for each
of the measurement series. The complete set of measurements can be found in appendix E.5.
Figure 3.4: Measurements were conducted on a solar panel and a parquet floor
The sensor performs overall better on a parquet floor than on a solar panel. All the displacement
measurements, except for fast movements on a solar panel, have similar standard deviation of
around 20 mm. This indicates that the sensor is fairly repeatable within a 2 cm margin for a slow
displacement of 1 m. Also the error on displacement on the solar panel is minimal when the
movement is slow and the lighting is diffused.
To conclude, this sensor has a decent repeatability when measuring displacement on a reflective
surface, as long as the speed is not too high. A solar panel cleaning robot has low speed. Thus,
this sensor can be used in combination with IMU and wheel encoders to localize such a robot on
a solar panel. Due to the minimal difference between direct and diffused lighting measurements
when operating on a solar panel, it is more convenient to utilize the built in LED’s rather than to
attach a second diffuse light source next to the sensor.
25
Table 3.2: Standard deviations of displacements [mm] and counts per mm, errors on displacements [mm]
and counts per mm, and means of counts per mm
slow diffused parquet fast diffused parquet slow direct parquet fast direct parquet
σdistance [mm] 17.16 20.82 18.51 17.96
εdistance [mm] 26.49 35.75 35.47 37.51
µcpmm [counts/mm] 31.24 32.36 28.50 28.93
σcpmm [counts/mm] 0.54 0.67 0.53 0.52
εcpmm [counts/mm] 0.83 1.16 1.01 1.09
slow direct solarpanel fast direct solarpanel slow diffused solarpanel fast diffused solarpanel
σdistance [mm] 20.91 25.25 20.06 27.93
εdistance [mm] 46.73 42.11 39.07 64.54
µcpmm [counts/mm] 20.29 19.96 15.46 14.67
σcpmm [counts/mm] 0.42 0.50 0.31 0.41
εcpmm [counts/mm] 0.95 0.84 0.60 0.95
To further analyse the performance of the optical flow sensor, it is mounted on a turtlebot three, type
burger (Figure 3.5a). This robot is used for conducting the final experiments. An ultrasonic distance
sensor is added for edge detection (Figure 3.5b). This sensor has an emitter and a receiver. An
ultrasonic wave is emitted which is reflected and received back. Using the speed of sound, distance
can be calculated based on the time-of-flight principle. The communication between the optical
flow sensor and the turtlebot happens once again with SPI, now with a Raspberry Pi three B plus
model (appendix A.3). Turtlebot is a differential wheeled mobile robot that operates with the robotic
operating system (ROS). It is driven by two wheels, each with their own encoder. The robot also
has a 360° lidar and an IMU that consists of a 3-axis gyroscope, 3-axis accelerometer, and a 3-axis
magnetometer. It uses a Raspberry Pi 3 Model B+ and an OpenCR for computing [52].
Figure 3.5: Turtlebot3 burger equipped with an optical flow sensor and an ultrasonic distance sensor. A red
square on top for tracking.
26
The goal is to implement sensor fusion between the wheel encoders, the IMU, and the optical flow
sensor. Lidar is left out since the solar panel cleaning robot will be working on roofs where presence
of landmarks is not always guaranteed. Ultrasonic distance sensor is used to define the boundaries
of the solar panel.
Figure 3.6: Tracking a red square on a Turtlebot on a solar panel using a ceiling-mounted camera.
27
4 DEVELOPMENT
To get a feel for how optical flow works, attempts were made to apply the techniques discussed in
the literature study. The OpenCV library for Python provides various tools that can be applied using
a simple webcam.
The robot moves on a solar panel, which does not have many distinct features. There are patterns
on the panel, but they all have the same appearance. This makes the feature based approach
unusable here. Figure 4.1 shows application of the ORB feature extractor and the brute force
feature matcher on 2 images of a solar panel surface. For comparison, same methods are applied
to two images of an environment with distinct features in figure 4.2.
Figure 4.1: Two frames are captured with the camera displaced over a few centimeters to the right. Features
are found in both frames using the ORB method and they are matched using the brute force
matcher. Very few features are found and even these found features cannot be matched properly
because they are not distinct enough.
28
Figure 4.2: The matching of features is more accurate when the features are distinct.
A webcam mounted on a cardboard box is used to capture frames (Figure 4.3a). A smartphone
provides the lighting. Corners are detected in the first frame using the Shi-Tomasi corner detector.
Then, optical flow is calculated for these corners using the Lucas-Kanade method. The algorithm
returns the location of the corners in the second frame (Figure 4.3b). Pixel displacements are
calculated and by integrating average value of the pixel displacements, the total linear displacement
is found. Pixel-per-centimeter values are found experimentally to convert the measurements to
centimeters (Table 4.1, Table 4.2). The Python script for this algorithm can be found in appendix
B.1.
Figure 4.3
29
Table 4.1: Calibration in X direction
Three measured trajectories are plotted in figure 4.4. As discussed before, optical flow assumes
constant brightness between at least two consecutive frames and similar motion of neighbouring
pixels. When these assumptions are violated, measurements are prone to large errors. Even a
slight change in lighting or speed, causes errors. These errors accumulate over time to give large
inaccuracies in the displacement measurements as can be observed in the figures below.
This algorithm demonstrates the working principle of sparse optical flow. However, it is not reliable
enough to be used as a stand-alone measurement technique for a real application on the solar
panel. Further processing and sensor fusion might provide more stable results.
30
(a) (b)
(c)
Figure 4.4: Sparse optical flow using a webcam: reference path in red and measured path in green
OpenCV also offers tools for applying dense optical flow to an image sequence. The Gunnar
Farneback’s algorithm is used to calculate the flow. Same procedure as in the sparse optical flow
algorithm is used to find the pixel-per-centimeter value. The source code can be found in appendix
B.2.
Since flow is calculated for every pixel, this algorithm is very slow for the standard frame size from
the webcam. To run this algorithm at real-time speed the frame must be cropped. A window of
160x160 pixels is chosen here. To get more robust results, high values should be chosen for the
size of the pixel neighborhood, and the size of the averaging window.
Figure 4.5 shows measurements for the same three paths that had been tested with the sparse
optical flow method. Also this method is very sensitive to lighting and speed changes. Besides
being very slow, this technique is also a relative measuring technique. Thus, sensor fusion is
required to compensate for the accumulation of errors.
31
(a) (b)
(c)
An optical flow sensor was selected in chapter 3.1. Since this sensor outputs more reliable mea-
surements than the developed algorithms in section 4.1, it is decided to fuse this sensor with the
wheel encoders and the IMU. For this purpose, attempts were made to apply the extended Kalman
filter in ROS (Figure 4.6). ROS works with a publisher-subscriber system that can work asyn-
chronously. So, the predict and the update steps of the Kalman filter are also executed asyn-
chronously. The pose of the robot was derived using the differential drive robot kinematics and
encoder measurements as shown in eq. 4.1.
32
Figure 4.6: EKF applied to derive the pose from the wheel encoders, the optical flow sensor, and the IMU
(d r − d l )
θ′enc = θenc +
wb
sin(θ′enc )(d r + d l )
y′enc = yenc +
2
Likewise, a pose was derived by combining the incremental displacement from the optical flow
sensor with the orientation from the IMU (eq. 4.2).
′
xopt = xopt + dopt cos(θIMU )
(4.2)
y′opt = yopt + dopt sin(θIMU )
For the prediction step of the EKF, linear and angular velocity commands were used. Assuming
constant velocities, the pose of the robot can be predicted as follows:
′
xcmd = xcmd + vcmd ∆tcos(θ′cmd ) (4.3)
33
This pose was then updated using eq. 4.1 and eq. 4.2. Both the wheel encoders and the optical
flow sensor were publishing measurements at 25 hertz. The linear and angular velocity commands
were published at 145 hertz. The noise covariances of the sensors were made dependent of the
total distance travelled to compensate for the error accumulation. The results from this filter were
unsatisfactory (Figure 4.7).
Figure 4.7: Bad EKF result: green curve is the optical flow sensor, dark blue is the prediction, red is the
encoder, turquoise is the ground truth, and yellow is the fused estimation. The fused value
deviates too much from the ground truth.
There could be many reasons for this failure. Wrong estimation of the covariance matrices is the
most frequently occurring error in practice. However, in this case, the data that is being fused does
not fully represent what the sensors are measuring. An optical flow sensor and wheel encoders
measure incremental displacements, not a pose. Therefore, what follows is an implementation of
an one-dimensional linear Kalman filter to fuse these incremental displacements (Figure 4.8). The
estimated incremental displacement, ∆ f used , is then implemented in the robot’s motion model.
Figure 4.8: Implementation of an one-dimensional linear Kalman filter to fuse incremental displacements
from the wheel encoders and the optical flow sensor
34
The equations for an one-dimensional linear Kalman filter are given by eq. 4.4. Prediction is once
more made using the linear velocity command published at an average of 45 hertz. The validity of
the constant speed model was tested using the tracking algorithm which will be discussed in the
next section (Figure 4.9). The measurements from the optical flow sensor and the wheel encoders
are published at 25 hertz. The state that is being estimated, is the incremental displacement, s (eq.
4.5).
Figure 4.9: Turtlebot was given a constant velocity command of 2 cm/s from standstill. A straight path was
driven for a meter. Using the tracker, the instantaneous speeds were measured. An average
speed of 2.4 cm/s (blue dotted line) with a standard deviation of 1.0 cm/s was found. The
average is pulled a little higher than the actual value by the outliers that can be seen on the plot.
Predict
x̄ = x + dx
P̄ = P + Q
U pdate (4.4)
y = z − x̄
K = P̄/(P̄ + R)
x = x̄ + Ky
P = (1 − K)P̄
To make a prediction of the next state, the linear velocity command is multiplied by the time passed
since the last call of the predict step . Q, the process noise covariance, is determined through trial
and error, and is chosen as 0.0025. This gives eq. 4.5 for the prediction step. The initial chosen
values for the state s, and the state covariance P, are zero and ten respectively.
35
s̄ = vcmd ∆t
(4.5)
P̄ = P + 0.0025
Both, the wheel encoders (eq. 4.7) and the optical flow sensor (eq. 4.8), update this prediction step
with their incremental displacements as the measurements. The incremental displacement from
the wheel encoders is denoted as ∆e, and is given by:
(dl + dr )
∆e = (4.6)
2
where dl and dr are the incremental displacements of the left wheel and the right wheel respectively.
The incremental displacement from the optical flow sensor is denoted as dopt . It is the value
returned by the optical flow sensor converted to centimeters using an experimentally determined
scale factor discussed in chapter 3.2.
y = ∆e − s̄
P̄
K=
P̄ + Renc (4.7)
s = s̄ + Ky
P = (1 − K)P̄
y = dopt − s̄
P̄
K=
P̄ + Ropt (4.8)
s = s̄ + Ky
P = (1 − K)P̄
Since both of these sensors return relative measurements, error accumulation is to be expected
over time. Therefore, the measurement noise covariances are chosen as functions of total dis-
placements measured by these sensors. Renc and Ropt are the measurement noise covariances of
the encoders and the optical flow sensor respectively. The noise covariance equations were de-
termined through series of tests, where standard deviation of the measurements were determined
for two different distances (appendix E.1). The difference in the standard deviations was divided by
the difference in the two distances to obtain eq.4.9
where enctotal is the total accumulated displacement derived from the encoders and opttotal is the
total accumulated displacement derived from the optical flow sensor. A constant term is added to
avoid the noise covariance from reaching a zero value.
36
Furthermore, the orientation from the IMU is also fused with the orientation derived from the velocity
command, and the wheel encoders using an one-dimensional Kalman filter (Figure 4.10).
Figure 4.10: Implementation of an one-dimensional linear Kalman filter to fuse the oreintations from the
wheel encoders, the angular velocity command and the IMU.
The prediction is performed at 45 hertz. The updates from the wheel encoders and the IMU are
performed at 25 hertz. The noise covariances are determined through some tests where a compass
from a smartphone is used as the reference (Figure 4.11). The tests can be found in appendix
E.2. The initial value for the orientation and the state covariance, P, are chosen as zero and one
respectively.
Figure 4.11: A smartphone compass is placed on top of the turtlebot. A rotation of 90◦ is made with the
turtlebot. The orientation from the encoders, the IMU, and the angular velocity command are
tested.
37
The fused orientation, θ f used , is implemented in the robot’s motion model together with the fused
incremental displacement, ∆ f used (eq. 4.10).
′
xbot = xbot + ∆ f used cos(θ f used )
(4.10)
y′bot = ybot + ∆ f used sin(θ f used )
To further constrain the position estimation, edge detection is added using ultrasonic distance sen-
sor. However, in order for this to work, the starting position of the robot, and the dimensions of the
solar panel must be known. In case of Helios, the camera on the drone could be used to find the
solar panel’s dimension and the position where the robot is dropped off on the solar panel. The x
and the y position of the robot derived above can be fused with the x and the y position estimated
from the edge detection, using a two-dimensional linear Kalman filter (Figure 4.12). This way the
uncertainty on the starting position of the robot and the solar panel’s dimensions can be taken into
account.
Figure 4.12: Implementation of a 2D linear Kalman filter to fuse the x and the y position from the 1D Kalman
filter with the x and the y position derived from the edge detection.
The algorithm is ready to be tested. Early tests show good results, therefore, experiments are
conducted in chapter 5 followed by a discussion of the results in chapter 6.
In order to measure the reference trajectory of the turtlebot, a tracking algorithm is developed using
a webcam which is mounted on the ceiling above the solar panel at a distance of 2.1 meters. The
algorithm uses the Opencv library available for IPython 3.8.
38
4.3.1 Defining an object to track
Frames are captured by the webcam and analysed using the OpenCV library. The goal is to find a
way to compare consecutive frames and derive the displacement of the turtlebot. Since the webcam
produces frames in RGB, it is be possible to use colour to determine an object to track. However,
the solar panel itself is dark blue which closely resembles the black colour of the turtlebot, making
the turtlebot not distinct enough from its environment. The solar panel also has white patterns.
Besides these patterns, the reflection of the light also appears to be white (Figure 4.13, red box).
So, the colour to be tracked should also be distinguishable from white. The colour red fulfills these
criteria. Therefore, a red square is placed on top of the turtlebot (figure 4.13, green box).
Figure 4.13: Unprocessed frame from the webcam. Red box: light reflection, yellow box: calibration pattern,
green box: turtlebot with a red square.
4.3.2 Calibration
The webcam produces frames of 480 by 640 pixels. The pixel coordinates of the center of the
red square is determined in the very first frame. This is then considered as the new origin of
the pixel coordinate frame. The pixel coordinates of the center of the red square is found for the
following frames. To get the pixel coordinates of the turtlebot with respect to the new origin, the
difference between the coordinates from the first frame, and the coordinates from the current frame
is calculated. This difference is also the total pixel displacement of the turtlebot in the x and the
y direction. To convert this pixel displacement to centimeters, a conversion scale is determined
by using a calibration pattern (figure 4.13, yellow box). This pattern has three rows of four black
circles. Each circle has a diameter of 5 cm. The circles next to each other are 6.5 cm (center to
center) apart. In order to calculate the scale, a calibration program is written. The Python script for
this program can be found in appendix C.1.
Calibration algorithm
The calibration algorithm detects the twelve black circles on the white paper. The coordinates of the
center points of these circles are stored in an array. The pixel distance between the center points
of the circles that lie next to each other, is determined. This same distance in centimeters is known
and is equal to 6.5 cm. Using this knowledge, the conversion scale can be determined. In order
39
to have a more correct scale, the distances between all the circles that are next to each other, are
used to calculate an average scale.
However, there is one more problem that needs to be solved before calculating the scale. The
webcam causes a fish-eye effect. The image needs to be flattened before determining the pixel
displacements. This unwarping is done using the cv2.warpPerspective() function. This function can
get rid of perspective deformations by transforming the warped image using a special matrix [53].
This special matrix is obtained using the cv2.getPerspectiveTrasform() function which determines a
perspective transform matrix based on four border points in the warped image. Using this method,
the image can be transformed from figure 4.14 to figure 4.15. The code for this operation can be
found in appendix C.1 line 65 to 87.
Figure 4.14: Warped gray scale image from the calibration algorithm with the calibration sheet
Figure 4.15: Unwarped gray scale image from the calibration algorithm with the calibration sheet
With the frames now flattened, the calibration process can begin. The algorithm extracts the black
circles from the image using a gray-scale threshold. This threshold will filter out the white colour
from the image. The result is shown in figure 4.16. Using the ’pointpos()’ function (appendix C.1
line 22 to 27), corner points for a region of interest (ROI) can be selected. This function is used
to define the edges of a calibration zone, so that the noise from the image (the dark blue colour of
the solar panel itself) has no influence on the detection of the circles. From this processed image,
the black circles can be extracted easily due to the high contrast between the white sheet and the
black circles .The code for this operation can be found in appendix C.1 line 104 and 105.
40
Figure 4.16: Unwarped image overlaid with a threshold mask to filter out the white colour
The next operation is the detection of the circles in the ROI. The function cv2.findContours() is able
to detect contours in an image. These contours are stored in an array as a list of points. From this
array, the function cv2.contourArea() is used to determine the contour area. Detected objects with
too large or too small contour area to be one of the calibration circles, are filtered out. The func-
tion cv2.arclength() computes a closed curve from the points calculated by the cv2.findContours()
function. From this curve, a more smooth approximation of the circle is computed by using the
cv2.approxPolyDP(). The result from this is used as input for the cv2.BoundingRect() function. This
function finds the bounding rectangle that encompasses the circle. It returns four values. These
four values represent the x and the y coordinate of the top left corner of the bounding rectangle,
the width, and the height of the bounding rectangle. Using this information, the center point of the
bounding rectangle and thus also the center point of the circle, is determined. This point will then
be stored in a list. The code for these operations can be found in appendix C.1 line 106 to 121.
The order in which the circles are detected by the algorithm are random. So, the rest of the algo-
rithm determines which circles are neighbours to each other. The determination is based on the
knowledge that the top left pixel of a frame is the origin of the frame. The circle with the lowest
pixel coordinates is the most upper left circle. Based on this circle, all the other circles can be found
based on their relative distances. The code for these operations can be found in appendix C.1 line
123 to 186.
Once the array containing the center points of the circles is built, the distances between center
points are computed and an average conversion scale is printed. The code for these operations
can be found in appendix C.1 line 187 to 217.
The calibration is executed four times on eight different regions on the solar panel as indicated on
figure 4.17.
41
Calibration results
The calibration algorithm ran in all the eight regions for four times delivered the results shown in
table 4.3.
Table 4.3: Calibration of the tracking algorithm
Since the standard deviation is small, it can be concluded that the conversion scale is stable enough
to be used for the entire solar panel. This conversion scale is used in the tracker algorithm in order
to calculate the displacement in centimeters.
With the calibration, and the definition of an object to track concluded, the tracking algorithm itself
can be built. The frames read from the webcam are once again processed to get rid of the fish-eye
effect. The cv2.findContours() function together with the calculation of the center point is performed
as discussed in the calibration process. Only now a red square is used instead of a black circle.
There are two new functions which the algorithm applies. The first function is called the mask
transform function. This function is a more advanced filter than the threshold filter that is applied in
the calibration algorithm. Here, the filter converts a RGB frame into an HSV colour frame. Using
the HSV colour frame, one can easily determine a range for the hue, saturation and value in order
to design a colour filter. This colour filter is then combined with a Gaussian blur and threshold
function. The Gaussian blur function is implemented to smooth out the red square as due to a low
amount of pixels per centimeter (due to our setups limitations) there could be some irregularities.
The result of this combination of filters is than ran through the cv2.Canny() function. The Canny
function is a mathematical computation which is able to find edges in frames based on a change
in neighbouring pixel values. These edges are represented in an array the size of the initial frame
filled with booleans. The positions where these booleans are true, are interpreted as a point of
an edge. The output from this canny function is then used in the cv2.findContours() function. It is
now possible to detect the red square anywhere on the frame and determine it’s current position as
shown in figure 4.18.
42
Figure 4.18: Output frame where the mask transform filter is applied to the solar panel to detect the red
square.
On figure 4.18, there is still some noise present on the right and the top edge of the frame caused
by the orange parquet floor. The colour filter cannot distinguish red from orange. Another noisy
spot is located at the left of the red square. This noise is due to the light emitted by the red LED’s
from the OpenCR circuit board. In order to prevent noise from influencing the detection of the red
square, another function is implemented. It is called the move ROI function. It will only search for
the red square on the new frame in close proximity to where the red square was in the previous
frame.
This function computes the area around the last found center point of the red square. Then instead
of running the whole frame through the mask transform function, only this region of interest is
filtered. This operation is visualized in figure 4.19. Since the turtlebot has fairly low speeds, the
ROI can be set fairly small as no large displacements between consecutive frames are expected.
The code for the tracker algorithm can be found in appendix C.2.
Figure 4.19: Output frame of the tracker where the ROI is adaptive to the current turtlebot position.
4.3.4 Validation
Before the tracking algorithm is used to estimate the reference trajectory, it is validated. A red
square is moved for 75 centimeters across the solar panel. This measurement is done seven
times. The results are summarized in table 4.4.
43
Table 4.4: Tracking algorithm validation
4.3.5 Conclusion
According to the results in table 4.4, the tracker has a very good precision and repeatability. How-
ever, slight change in lighting can cause the shape of the red square to vary between two frames.
This causes the center points of the squares to shift slightly. Furthermore, due to the low resolution
of the webcam, rounding errors occur when converting pixel values to centimeter values. Therefore,
the position extracted from the tracker has a precision of 0.3 centimeters. Since this precision is
sub-centimeter while the odometry sensors achieve a precision in the centimeter range, the tracker
is considered a good option to obtain a reference trajectory. The noise caused by the slight changes
in lighting is filtered using the Savitzky-Golay filter. This filter can be implemented in python using
the scipy library. The filter applied on the tracker uses the measured datapoints for x,y coordinates
to determine a trajectory based on 31 data points. As the goal is to remove noisy measurements,
the design decision is made that the filter should approximate a third order polynomial. This allows
enough flexibility while staying true to the actual position as shown in figure 4.20. The python code
for this filter can be found in appendix C.2 line 208 to 212.
Figure 4.20: Comparison between the unfiltered output of the tracker algorithm (red) and the filtered output
(blue) using the Savitsky-Golay filter.
44
5 EXPERIMENTS
In this chapter, experiments are performed where Kalman filter discussed in chapter 4.2 is applied.
First, the pose of the Turtlebot was derived using only the wheel encoders. This measurement
technique is heavily affected by slip that is ever-present in such a wheel driven robot. Using the
orientation from the IMU, this measurement can be slightly better as shown in figure 5.1. However,
this technique is not good enough to be used alone.
Figure 5.1: Wheel encoders (red), wheel encoders + IMU (brown), reference (turquoise)
An optical flow sensor combined with the orientation from the IMU performs slighly better, because
it is not sensitive to wheel slippage (Figure 5.2). However, this sensor also is not good enough to
be used alone, as accumulated errors cause the estimation to drift away.
45
Figure 5.2: Wheel encoders (red), wheel encoders + IMU (brown), optical flow + IMU (green), reference
(turquoise)
One-dimensional linear Kalman filter is applied to fuse the incremental displacements from the
wheel encoders, the optical flow sensor, and the linear velocity command. Another one-dimensional
Kalman filter is used to fuse the orientations from the wheel encoders, the IMU, and the angular
velocity command. Using motion equations for a differential drive robot, where the estimated in-
cremental displacement, and the estimated orientation are implemented, a trajectory is derived.
Figure 5.3 shows the trajectories of the turtlebot, estimated using different methods.
Figure 5.3: Wheel encoders (red), wheel encoders + IMU (brown), optical flow + IMU (green), velocity com-
mand prediction (dark blue), reference (turquoise), and fused (encoder + optical flow + IMU)
(yellow)
46
The Kalman gain for the fusion of incremental displacements is plotted in figure 5.4.
47
The residuals from the wheel encoders and the optical flow sensor are given in figure 5.6.
(a) Residual from the encoders (b) Residual from the optical flow sensor
Figure 5.6
To make a point-wise comparison between the reference trajectory and the other trajectories, linear
interpolations are performed using the NumPy library in Python. Errors are calculated for both the
x and the y directions (eq. 5.1). The norms of these errors are calculated (eq. 5.2). The mean of
these norms is determined. This is the mean error given by equation 5.3, where n is the number
of points after interpolation. The mean squared error can then be calculated as given in equation
5.4. The mean squared error can be used as a numerical indicator for how far a trajectory lies from
the reference trajectory [54]. The smaller the mean squared error, the closer a trajectory is to the
reference trajectory.
q
NEk = e2xk + e2yk (5.2)
1 n
ME = ∑ NEk (5.3)
n k=1
1 n
MSE = ∑ (NEk − ME)2
n k=1
(5.4)
The mean errors and the mean squared errors for this path are given in table 5.1. The ME and
MSE of few more tests can be found in appendix E.4.
48
Table 5.1: Mean errors and mean squared errors
Figure 5.7 shows a trajectory where the ultrasonic distance sensor is used to further correct the
trajectory when there is an edge detected.
The trajectory correction is visible when zoomed in, as shown in figure 5.8. The fused y position
jumps to the y position of the edge, from where the trajectory continues.
49
When the edge is detected, the covariance in y direction gets smaller as shown in figure 5.9. The
covariance ellipse is equally big in the x and the y direction before the edge. Once the edge is
detected, the covariance in the y direction gets smaller. The complete measurements can be found
in appendix E.6.
Figure 5.9: Covariance in the y direction gets smaller when an edge is detected.
50
6 DISCUSSION
The first test demonstrates that the fused trajectory comes closest to the reference trajectory com-
pared to the other trajectories. This means that the sensor fusion between the wheel encoders,
the optical flow sensor, and the IMU greatly improves the position estimation, compared to these
sensors used individually. This can also be seen numerically via the mean squared errors. The
MSE of the fused trajectory is smaller than the MSE of the other trajectories. On average, the fused
trajectory performs four times better than the encoder and IMU model. Few more measurements
can be found in appendix E.3.
The Kalman gain, K, decides whether the estimated state follows the measurements or the pre-
diction. This value lies between zero and one. A high value means that the filter should trust the
measurement. In our experiment, K starts high, then gradually declines. This is caused by the
measurement noise variances that get larger with the total displacement.
The state covariance, P, gets larger as the process noise, Q, is added every time a prediction is
made. A Kalman gain larger than zero results in a reduction in the state covariance. So, the Kalman
gain is responsible for keeping the state covariance from getting too high.
Also the stability of the residuals is very important for the stability of the filter. In our case, the
residuals are fairly stable.
The second test showed that an ultrasonic distance sensor can be used for edge detection to
further correct the position estimation. This technique is powerful if the starting position of the robot
and the solar panels dimensions are known.
This sensor fusion method produces an estimate that is more accurate than the estimates from
the encoders and the optical flow sensor alone. Further tuning and addition of absolute measuring
techniques should make this better.
51
7 CONCLUSION
The research question of this thesis was as follows: ”How to improve position estimation of a fully
autonomous solar panel cleaning robot equipped with wheel encoders and an IMU?” Available solar
panel cleaning mechanisms were discussed. Existing measurement techniques were investigated
to choose the suitable ones for this application. The final measurement setup had to be immune to
wheel slippage and functional on a reflective surface on open air. The desired accuracy was within
a few centimeters. A fused setup between visual odometry, inertial navigation, and wheel odometry
was chosen to be developed further. Visual odometry algorithms were explored using the OpenCv
library in Python.
Few of the prominent sensor fusion techniques were described. A visual odometry sensor was
selected and tested, to be fused with an IMU and wheel encoders. A final test setup was designed
using the turtlebot. A tracking algorithm was developed to validate the measurements.
Attempts were made to apply the Extended Kalman filter, however the results were inaccurate.
Finally, the one-dimensional linear Kalman filter was tested which delivered promising results for
the displacement estimation. A second linear Kalman filter was applied to derive a good estimation
of the orientation. The output from these two Kalman filters were used in the robot’s motion model
to derive a position estimate. This estimate was further fused with the position estimate from edge
detection using a two dimensional Kalman filter. Comparisons were made between trajectories
estimated by the velocity command model, wheel encoders, wheel encoders combined with an
IMU, optical flow sensor combined with an IMU, and all of these fused together. The fused trajectory
was consistently better than the others. On average the fused trajectory performed four times better
than the estimation solely relying on wheel encoders and IMU.
To answer the research question: The position estimation of a fully autonomous solar panel clean-
ing robot equipped with wheel encoders and an IMU can be improved by fusing these two with
an optical flow sensor. However, the optical flow sensor remains a relative measurement tech-
nique which accumulates error over time and requires consistent lighting conditions. For optimal
operation, the use of absolute measuring techniques is recommended.
It should be noted that the vehicle used for the experiments is not designed to manoeuvre on a
solar panel. There is a significant amount of wheel slippage caused by the glossy surface. A wheel
design that functions on a glass surface should deliver better results. Furthermore, due to lack
of space on the turtlebot, the optical flow sensor could not be mounted on the center of rotation
causing errors when the robot made turns. This work can be further improved by implementing
an absolute measurement technique. If the drone could somehow deliver landmarks on the solar
panel, use of lidar technology could provide significant improvements. Testing this on a prototype
with wheels designed for a solar panel should also deliver better results. Developing algorithms to
determine the solar panel dimensions, and the initial position of the cleaning robot using the drone
is also crucial to use edge detection for position estimation.
52
BIBLIOGRAPHY
[4] SolarCleano, “B1a solarbridge - the future of solar farm cleaning,” https://solarcleano.com/en/
products/solarbridge-autonomous-solar-panel-cleaning-robot-b1a, 2022.
[6] Federal Aviation Administration, “Satellite navigation - gps - how it works,” https:
//www.faa.gov/about/office org/headquarters offices/ato/service units/techops/navservices/
gnss/gps/howitworks, accessed Apr. 19, 2022.
[9] S. G. Tzafestas, “12 - mobile robot localization and mapping,” in Introduction to Mobile Robot
Control, S. G. Tzafestas, Ed. Oxford: Elsevier, 2014, pp. 479–531. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/B9780124170490000122
[11] K. Sooknanan, D. Corrigan, G. Baugh, N. Harte, and J. Wilson, “Indexing and selection of
well-lit details in underwater video mosaics using vignetting estimation,” 05 2012, pp. 1–7.
[13] datahackers.rs, “014 calculating sparse optical flow using lucas kanade method,” https:
//datahacker.rs/calculating-sparse-optical-flow-using-lucas-kanade-method/, (accessed Apr.
22, 2022).
53
[16] R. R. L. Jr, “Kalman and bayesian filters in python,” https://github.com/rlabbe/
Kalman-and-Bayesian-Filters-in-Python, 2020.
[17] E. Wan and R. Merwe, “The unscented kalman filter for nonlinear estimation,” vol. 153-158, 02
2000, pp. 153 – 158.
[18] M. A. Green, “Solar cell efficiency tables (version 45),” Progress in Photovoltaics: Research
and Applications, vol. 23, pp. 1–9, 2022.
[19] R. Appels, “Effect of soiling on photovoltaic modules,” Solar Energy, vol. 96, pp. 283–291,
2013.
[23] Federal Aviation Administration, “Satellite navigation - gps - how it works,” https://www.gps.
gov/systems/gnss/, (accessed Apr. 18, 2022).
[26] F. Petit, “What is lidar? an overview of lidar sensor, its types, and advantages,” https://www.
blickfeld.com/blog/what-is-lidar-sensor/, 2020, (accessed Apr. 20, 2022).
[28] A. Barzegar, O. Doukhi, and D. Lee, “Design and implementation of an autonomous electric
vehicle for self-driving control under gnss-denied environments,” Applied Sciences, vol. 11, 04
2021.
[29] Novatel, “Apn-064 imu errors and their effects,” https://www.gnss.ca/app notes/APN-064 IMU
Errors and Their Effects.html, 2014, (accessed Apr. 19, 2022).
[31] D. Scaramuzza and F. Fraundorfer, “Visual odometry [tutorial],” IEEE Robotics Automation
Magazine, vol. 18, no. 4, pp. 80–92, 2011.
54
[34] D. Thailappan, “Feature detection, description and matching of im-
ages using opencv,” https://www.analyticsvidhya.com/blog/2021/06/
feature-detection-description-and-matching-of-images-using-opencv/, (accessed Apr.
22, 2022).
[38] M. B. Alatise and G. P. Hancke, “A review on challenges of autonomous mobile robot and
sensor fusion methods,” IEEE Access, vol. 8, pp. 39 830–39 846, 2020.
[40] V. Fox, J. Hightower, L. Liao, D. Schulz, and G. Borriello, “Bayesian filtering for location esti-
mation,” IEEE Pervasive Computing, vol. 2, no. 3, pp. 24–33, 2003.
[42] R. Schneider and C. Georgakis, “How to not make the extended kalman filter fail,” Industrial
I& Engineering Chemistry Research, 2013.
[49] Elecrow, “Optical flow sensor for apm series flight control board,” https://www.elecrow.
com/optical-flow-sensor-for-apm-series-flight-control-board-p-1379.html, (accessed Apr. 25,
2022).
55
[51] P. Barry and P. Crowley, Modern Embedded Computing. Elsevier Inc., 2012, ch. Chapter 4 -
Embedded Platform Architecture, pp. 41–97.
[54] E. Al Khatib, M. Jaradat, M. Abdel-Hafez, and M. Roigari, “Multiple sensor fusion for mobile
robot localization and navigation using the extended kalman filter,” 12 2015.
56
APPENDICES
E Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-7
E.4 Mean error and mean squared error comparison between estimations and refer-
ence trajectories . . . . . . . . . . . . . . . . . . . . . . . . . E-10
E.5 Initial performance test of the PAA5100JE optical flow sensor . . . . . . . . . E-11
57
APPENDIX A : PAA5100JE AND RASPBERRY PI ZERO W
A.1 PAA5100JE schematics
WHITE
Inputs and outputs are both turned off when VCC is off 33R LED1
-> all data lines are reverse protected 3V3 LED_CATHODE
SSM3K56MFV,L3F
R4
J2
7 5V
U1 WHITE
LED2
6 CS_5V 3V3 5
VCC
5 SCK_5V
4 MOSI_5V CS_5V 1 6 CS
1A 1Y
Q1
3 MISO_5V 3 4
2A 2Y
2 MOTION_5V
1 2
GND
GND
GND
5V
U2
3V3 5 2V0
VCC
PAA5100JE-Q
0.1uF
MOSI_5V 1 6 MOSI C1
C2
SCK_5V 3
1A
2A
1Y
2Y
4 SCK PMW3901MB /
U7
10u PAA5100JE-Q
2
GND VDD: 1.8-2.1V
5V IN OUT 3V3 VDDIO: 1.8-3.6V
2 17 SCK
VDD SCLK
C3
C12
GND MOSI
300mA Low Dropout Regulator 3V3 3 18 MISO
VDDIO MISO
3.3V AP7333-33SRG 10u CS
19 CS
4
VREG
0.1uF
C4
C5 7
RESET MOTION
15 MOTION
0.1uF
GND GND
C6
10u 1*3
LED 6mA max 20
SPI outputs are turned off when CS_5V is high. GND LED LED_CATHODE
U4 GND
3V3 GND
5V 5 2 MISO GND
VCC IN
4 MISO_5V
OUT
1uF
C8
3 1 CS_5V
GND OE
AP7330
VIN: 1.8-5.5V
Vref: 0.8V M74VHC1GT125DF2G
3V3 1 5 2V0
VIN VOUT
U3
C11
1uF
1uF
C7
3 5V 5 2 MOTION
EN VCC IN
3k3
4 MOTION_5V
R2
OUT
2 4 3 1 CS_5V
GND ADJ GND OE
GND
GND
GND M74VHC1GT125DF2G
2k2
R3
GND
A-1
A.2 Raspberry Pi Zero W schematics
5 4 3 2 1
3V3
PP1
5V
J1 U3
6
7
5V 7 8 L1 4.7u
6 VIN1 LX1 PP8
1 EN1
2 R1
D 690-005-298-486 3 100K J5 D
4 1% 5V 4 1V8 RUN 1
5 C1 1005 FB1 RUN VC_TRST_N 2
47u 1 2 VC_TDI 3
PP5 3216 12 VIN2 LX2 PP9 4
PP6 EN2 L2 4.7u VC_TDO 5
8
9
VC_TMS 6
C2 C3 C4 7
100n 10u 10u 10 VC_TCK 8
1005 2012 2012 FB2
5 C7 C8 68710814522
3 NC1 11 10u 10u
9 GND NC2 13 2012 2012
GND PAD
PAM2306AYPKE
1V8
20
21
C9 J12
220n
SH1
SH2
1005 U1D 3V3
P9 R8 XTAL_N 1 J6 1
C PLL_1V8 OSC XTALN X1 HDMI_T2P 2 D2_SH CAM1_DN0 2 C
HDMI_T2N 3 D2+ 690-019-298-903 CAM1_DP0 3 R20
PLL_VDD N7 R9 XTAL_P 1 3 4 D2- 4 470
P7 PLL_VDD_OUT XTALP HDMI_T1P 5 D1_SH CAM1_DN1 5 1%
C12 C13 PLL_VDD HDMI_T1N 6 D1+ Mini HDMI CAM1_DP1 6 1005
1u 220n C10 19.2MHz C11 7 D1- 7
D0_SH
2
SH3
SH4
HDMI_HPD SDA0 21
SDA0
22
3V3
22
23 54548-2271
3V3
B B
R23 R24
1.8K 1.8K
1% 1%
3V3 1005 1005
U1E
H4 H2 ID_SD 3V3 5V
E4 VDDIO2_1 GPIO1 GPIO0 H6 ID_SC
VDDIO2_2 GPIO1 J6 GPIO2
C67 GPIO2 J4 GPIO3 J8
220n GPIO3 H1 GPIO4 1 2
1005 GPIO4 H3 GPIO5 (SDA1) GPIO2 3 4
GPIO5 G1 GPIO6 (SCL1) GPIO3 5 6
GPIO6 F2 GPIO7 (GPIO_GCLK) GPIO4 7 8 GPIO14 (TXD0)
GPIO7 F1 GPIO8 9 10 GPIO15 (RXD0)
GPIO8 G2 GPIO9 (GPIO_GEN0) GPIO17 11 12 GPIO18 (GPIO_GEN1)
GPIO9 G3 GPIO10 (GPIO_GEN2) GPIO27 13 14
GPIO10 G4 GPIO11 (GPIO_GEN3) GPIO22 15 16 GPIO23 (GPIO_GEN4)
GPIO11 H7 GPIO12 17 18 GPIO24 (GPIO_GEN5)
GPIO12 (SPI_MOSI)
ID_SD and ID_SC PINS:
G5 GPIO13 GPIO10 19 20
GPIO13 D4 GPIO14 (SPI_MISO) GPIO9 21 22 GPIO25 (GPIO_GEN6) These pins are reserved for ID EEPROM.
GPIO14 E2 GPIO15 (SPI_SCLK) GPIO11 23 24 GPIO8 (SPI_CE0_N)
GPIO15 25 26 GPIO7 (SPI_CE1_N) At boot time this I2C interface will be
C1 GPIO16 ID_SD 27 28 ID_SC interrogated to look for an EEPROM
GPIO16 E1 GPIO17 GPIO5 29 30 that identifes the attached board and
GPIO17 B4 GPIO18 GPIO6 31 32 GPIO12 allows automagic setup of the GPIOs
GPIO18 (and optionally, Linux drivers).
G8 GPIO19 GPIO13 33 34
GPIO19 E5 GPIO20 GPIO19 35 36 GPIO16
GPIO20 C4 GPIO21 GPIO26 37 38 GPIO20 DO NOT USE these pins for anything other
A GPIO21 A
B3 GPIO22 39 40 GPIO21 than attaching an I2C ID EEPROM. Leave
GPIO22 C5 GPIO23 unconnected if ID EEPROM not required.
GPIO23 A4 GPIO24
GPIO24 A5 GPIO25
DNF © Raspberry Pi 2015
GPIO25 D6 GPIO26
GPIO26 B5 GPIO27 www.raspberrypi.org
GPIO27
Title Drawn By
BCM2835
DNI Raspberry Pi ZeroW Roger Thornton
Date: Sheet 1 of 1
5 4 3 2 1
A-2
A.3 Raspberry Pi 3 B+ schematics
5 4 3 2 1
PP2
5V
6
7
5V
F1 R69
J1 1 1 2 5V_SYS
2 PP7
3 C196 C197
POWER IN MF-MSMF250/X 10R
1
4 D7 10u 1u
10103594-0001LF
2
PP6
D D
VIN
R81 NF
Bridged via trace on PCB
C203
10R 47u 3V3A
1V8 1005 3216 U8
1% 24 8 J10
AGND 31 5V_SYS 3V3_LDO C189
5V_SYS 3V3 1u TR0p 1
5V 7 1005 1
U5N VIN1 VIN2 VIN3 VIN4 28 VIN 25 PP8
C180 BCM2837B 12 VIN1 VOUT1 C70
220n 29 VIN2 27 LX1 L7 22u TR0n 2 2
1005 C187 C202 C188 C193 C195 C186 C194 13 VIN3 LX1 1608 TR1p 3
T5 22u 100n 22u 22u 100n 100n 22u VIN4 2R2 3
PLL_1V8 U8 XTAL_N 1608 1005 1608 1608 1005 1005 1608 3V3
XTALN 1
C75 VDDIO 1V8 TR1n 6 6
PLL_VDD U5 1u C338 100n 4
W6 PLL_VDD_OUT 1005 9 PP9 1005 5
V6 PLL_VDD VOUT2 C78 TR2p 7
XOSC_VDD X2 11 LX2 L14 22u 4
C183 C175 C174 2 LX2 1608
INT_SDA SDA 2R2
1u 220n 220n F18 V8 XTAL_P 1 3 3
AGND_13
AGND_14
18p 18p
PP15 1005 1005 21 32 1V1
AN1 VOUT3
W16
W18
PGND1
PGND2
PGND4
PGND4
PG1 LX4 470n
AGND
RUN PP30 20 16 3216 3216
PG2 AGND LX4 19 SHELL
20 SHELL
XR77004 J14
26
10
17
18
23
5V A70-112-331-N126
TR2_TAP 4 2 TR0_TAP
TR3_TAP 3 1 TR1_TAP
R68
2W PIN HEADER
3.3K
5V 3V3
DISPLAY R61
CAMERA 1.8K
J4 1% R70 R72
J3 1005 1K 470R
R16
C83 A/V 1% 1%
1 30 AUDIO_L 1005 1005
1 30 DSI1_DN1 2 29 U11 J7
CAM1_DN0 2 29 DSI1_DP1 3 28 3V3A 220R R17 2 PP10 PP13
CAM1_DP0 3 28 4 27 1005 C59 100R 47u PP25
4 27 DSI1_CN 5 26 AUD_PWM1 1 A1 Y1 6 1% 100n 1% 3216 4
CAM1_DN1 5 26 DSI1_CP 6 25 1005 1005 5 D6 D5
STATUS
CAM1_DP1 6 25 7 24 2 5 RED
POWER OK GREEN "ACT" LED
GND VCC
7 24 DSI1_DN0 8 23 3 1611 "PWR" LED 1611
CAM1_CN 8 23 DSI1_DP0 9 22 AUD_PWM0 3 4 Q4
CAM1_CP 9 22 10 21 A2 Y2 PP26 1 DMG1012T
C80 STATUS_LED_R
10 21 SCL0 11 20 R18
11 20 SDA0 12 19 AUDIO_R STX-35017-4N R71
CAM_GPIO0
CAM_GPIO1 12 19 13 18 NC7WZ16 C61 10K Q5
B SCL0 13 18 14 17 100n 220R R19 R60 3V3 1% DMG1012T B
SCL0 47u D4 STATUS_LED_G
SDA0 SDA0 14 17 3V3 15 16 1005 1005 C62 100R 1.8K 1005
15 16 1% 100n 1% 3216 1% 2
3V3
1005 1005 1005 COMPVID 3
1
1-1734248-5
1-1734248-5
BAV99
20
21
HDMI_T2P 1
2 J6
HDMI_T2N 3
H5V HDMI_T1P 4 47151-1051 3V3 5V
5V U9 5
PP20 HDMI_T1N 6 3V3
1 2 HDMI_T0P 7
IN OUT 8 C64 C65
C54 C56 HDMI_T0N 9 3V3 100n 100n
100n 3 100n HDMI_CKP 10 C67 1005 1005
1005 GND 1005 11 U1C 220n
HDMI_CKN 12 BCM2837B 1005 J8
G4
HDMI
J5
GPIO5 L2 H2 GPIO19 (SPI_MISO) GPIO9 21 22 GPIO25 (GPIO_GEN6) These pins are reserved for HAT ID EEPROM.
GPIO6 M1 GPIO5 GPIO19 F1 GPIO20 (SPI_SCLK) GPIO11 23 24 GPIO8 (SPI_CE0_N)
A GPIO7 J4 GPIO6 GPIO20 G3 GPIO21 25 26 GPIO7 (SPI_CE1_N) At boot time this I2C interface will be A
GPIO8 K1 GPIO7 GPIO21 F2 GPIO22 ID_SD 27 28 ID_SC interrogated to look for an EEPROM
GPIO9 B10 GPIO8 GPIO22 F3 GPIO23 GPIO5 29 30 that identifes the attached board and
GPIO10 K2 GPIO9 GPIO23 D1 GPIO24 GPIO6 31 32 GPIO12 allows automagic setup of the GPIOs
GPIO11 K3 GPIO10 GPIO24 D2 GPIO25 GPIO13 33 34
(and optionally, Linux drivers).
GPIO12 B9 GPIO11 GPIO25 B5 GPIO26 GPIO19 35 36 GPIO16
GPIO13 H3 GPIO12 GPIO26 C5 GPIO27 GPIO26 37 38 GPIO20 DO NOT USE these pins for anything other
GPIO13 GPIO27 39 40 GPIO21 than attaching an I2C ID EEPROM. Leave
unconnected if ID EEPROM not required.
40W 0.1" PIN HDR © Raspberry Pi 2018
www.raspberrypi.org
Title Drawn By
GPIO EXPANSION Raspberry Pi 3 Model B+ (Reduced Schematic) Roger Thornton
A-3
APPENDIX B : OPTICAL FLOW ALGORITHMS IN OPENCV
B.1 Sparse optical flow algorithm
B-4
APPENDIX C : TRACKING ALGORITHM IN OPENCV
C.1 Calibration algorithm
C-5
APPENDIX D : LINEAR KALMAN FILTER
The algorithm can be found on: https://kuleuven-my.sharepoint.com/:f:/g/personal/
arne_maja_student_kuleuven_be/Ev_aCivJEDlJnqcWRAqttAwBFvBrFNVyfngTrBwTXqmcWA?
e=wZwkNx
D-6
APPENDIX E : TESTS
E.1 Covariance determination of the measurements
E-7
E.2 Noise covariances of the orientation
Figure E.2: Finding noise covariances of the orientations derived from the wheel encoders, the IMU, and
the angular velocity command.
E-8
Figure E.4: Test 2
E-9
Figure E.6: Test 4
E.4 Mean error and mean squared error comparison between estimations and reference
trajectories
Figure E.7: ME and MSE for the four plots given above
E-10
E.5 Initial performance test of the PAA5100JE optical flow sensor
E-11
FACULTY OF ENGINEERING TECHNOLOGY
CAMPUS GROUP T LEUVEN
Andreas Vesaliusstraat 13
3000 LEUVEN, Belgium
tel. +32 16 30 10 30
fax +32 16 30 10 40
[email protected]
www.iiw.kuleuven.be