
IEEE ROBOTICS AND AUTOMATION LETTERS, VOL. 4, NO. 4, OCTOBER 2019

Low-Level Control of a Quadrotor With Deep Model-Based Reinforcement Learning

Nathan O. Lambert, Daniel S. Drew, Joseph Yaconelli, Sergey Levine, Roberto Calandra, and Kristofer S. J. Pister

Abstract—Designing effective low-level robot controllers often entails platform-specific implementations that require manual heuristic parameter tuning, significant system knowledge, or long design times. With the rising number of robotic and mechatronic systems deployed across areas ranging from industrial automation to intelligent toys, the need for a general approach to generating low-level controllers is increasing. To address the challenge of rapidly generating low-level controllers, we argue for using model-based reinforcement learning (MBRL) trained on relatively small amounts of automatically generated (i.e., without system simulation) data. In this letter, we explore the capabilities of MBRL on a Crazyflie centimeter-scale quadrotor with rapid dynamics to predict and control at ≤50 Hz. To our knowledge, this is the first use of MBRL for controlled hover of a quadrotor using only on-board sensors, direct motor input signals, and no initial dynamics knowledge. Our controller leverages rapid simulation of a neural network forward dynamics model on a graphics processing unit (GPU) enabled base station, which then transmits the best current action to the quadrotor firmware via radio. In our experiments, the quadrotor achieved hovering capability of up to 6 s with 3 min of experimental training data.

Fig. 1. The model predictive control loop used to stabilize the Crazyflie. Using deep model-based reinforcement learning, the quadrotor reaches stable hovering with only 10,000 trained datapoints – equivalent to 3 minutes of flight.

Index Terms—Deep learning in robotics and automation, aerial systems: mechanics and control.
I. INTRODUCTION

THE ideal method for generating a robot controller would be extremely data efficient, free of requirements on domain knowledge, and safe to run. Current strategies to derive low-level controllers are effective across many platforms, but system identification often requires substantial setup and experiment time, while PID tuning requires some domain knowledge and still results in dangerous roll-outs. With the goal to reduce reliance on expert-based controller design, in this letter we investigate the question: Is it possible to autonomously learn competitive low-level controllers for a robot, without simulation or demonstration, in a limited amount of time?

To answer this question, we turn to model-based reinforcement learning (MBRL) – a compelling approach to synthesize controllers even for systems without analytic dynamics models and with high cost per experiment [1]. MBRL has been shown to operate in a data-efficient manner to control robotic systems by iteratively learning a dynamics model and subsequently leveraging it to design controllers [2]. Our contribution builds on simulated results of MBRL [3]. We employ the quadrotor as a testing platform to broadly investigate controller generation on a highly nonlinear, challenging system, not to directly compare performance versus existing controllers. This letter is the first demonstration of controlling a quadrotor with direct motor assignments sent from an MBRL-derived controller learning only via experience. Our work differs from recent progress in MBRL with quadrotors by exclusively using experimental data and focusing on low-level control, while related applications of learning with quadrotors employ low-level control generated in simulation [4] or use a dynamics model learned via experience to command on-board controllers [5]. Our MBRL solution, outlined in Figure 1, employs neural networks (NN) to learn a forward dynamics model coupled with a 'random shooter' MPC, which can be efficiently parallelized on a graphics processing unit (GPU) to execute low-level, real-time control.

Using MBRL, we demonstrate controlled hover of a Crazyflie via on-board sensor measurements and application of pulse width modulation (PWM) motor voltage signals. Our method for quickly learning controllers from real-world data is not yet an alternative to traditional controllers such as PID, but it opens important avenues of research. The general mapping of the forward dynamics model, in theory, allows the model to be used for control tasks beyond attitude control. Additionally, we highlight the capability of leveraging the predictive models learned on extremely little data for working at frequencies ≤50 Hz, while a hand-tuned PID controller at this frequency failed to hover the Crazyflie. With the benefits outlined, the current MBRL approach has limitations in performance and applicability to our goal of use with other robots. The performance in this letter has notable room for improvement by mitigating drift. Future applications are limited by our approach's requirement of a high-power external GPU – a prohibitively large computational footprint when compared to standard low-level controllers – and by the method's potential for collisions when learning.

The resulting system achieves repeated stable hover of up to 6 seconds, with failures due to drift of unobserved states, within 3 minutes of fully-autonomous training data. These results demonstrate the ability of MBRL to control robotic systems in the absence of a priori knowledge of dynamics, pre-configured internal controllers for stability or actuator response smoothing, and expert demonstration.

Manuscript received February 24, 2019; accepted July 7, 2019. Date of publication July 23, 2019; date of current version August 15, 2019. The work of J. Yaconelli was supported by the Berkeley Sensors & Actuator Center SUPERB REU Program. This letter was recommended for publication by Associate Editor R. Triebel and Editor T. Asfour upon evaluation of the reviewers' comments. (Corresponding author: Nathan O. Lambert.)

N. O. Lambert, D. S. Drew, S. Levine, and K. S. J. Pister are with the Department of Electrical Engineering and Computer Sciences, University of California–Berkeley, Berkeley, CA 94720 USA (e-mail: nol@berkeley.edu; [email protected]; [email protected]; pister@eecs.berkeley.edu).

J. Yaconelli is with the University of Oregon, Eugene, OR 97403-1202 USA (e-mail: [email protected]).

R. Calandra is with Facebook AI Research, Menlo Park, CA 94025 USA (e-mail: [email protected]).

This letter has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the authors. The video shows early and final results, with a brief discussion of future work.

Digital Object Identifier 10.1109/LRA.2019.2930489

II. RELATED WORK

A. Attitude and Hover Control of Quadrotors

Classical controllers (e.g., PID, LQR, iLQR) in conjunction with analytic models for the rigid body dynamics of a quadrotor are often sufficient to control vehicle attitude [6]. In addition, linearized models are sufficient to simultaneously control for global trajectory attitude setpoints using well-tuned nested PID controllers [7]. Standard control approaches show impressive acrobatic performance with quadrotors, but we note that we are not interested in comparing our approach to finely-tuned performance; the goal of using MBRL in this context is to highlight a solution that automatically generates a functional controller in less or equal time than initial PID hand-tuning, with no foundation of dynamics knowledge.

Research focusing on developing novel low-level attitude controllers shows functionality in extreme nonlinear cases, such as for quadrotors with a missing propeller [8], with multiple damaged propellers [9], or with the capability to dynamically tilt its propellers [10]. Optimal control schemes have demonstrated results on standard quadrotors with extreme precision and robustness [11].

Our work differs by specifically demonstrating the possibility of attitude control via real-time external MPC. Unlike other work on real-time MPC for quadrotors, which focuses on trajectory control [12], [13], ours uses a dynamics model derived fully from in-flight data that takes motor signals as direct inputs. Effectively, our model encompasses only the actual dynamics of the system, while other implementations learn dynamics conditioned on previously existing internal controllers. The general nature of our model from sensors to actuators demonstrates the potential for use on robots with no previous controller — we only use the quadrotor as the basis for comparison and do not expect it to be the limits of the MBRL system's functionality.

B. Learning for Quadrotors

Although learning-based approaches have been widely applied for trajectory control of quadrotors, implementations typically rely on sending controller outputs as setpoints to stable on-board attitude and thrust controllers. Iterative learning control (ILC) approaches [14], [15] have demonstrated robust control of quadrotor flight trajectories but require these on-board controllers for attitude setpoints. Learning-based model predictive control implementations, which successfully track trajectories, also wrap their control around on-board attitude controllers by directly sending Euler angle or thrust commands [16], [17]. Gaussian process-based automatic tuning of position controller gains has been demonstrated [18], but only in parallel with on-board controllers tuned separately.

Model-free reinforcement learning has been shown to generate control policies for quadrotors that out-perform linear MPC [4]. Although similarly motivated by a desire to generate a control policy acting directly on actuator inputs, the work used an external vision system for state error correction, operated with an internal motor speed controller enabled (i.e., thrusts were commanded and not motor voltages), and generated a large fraction of its data in simulation.

Researchers of system identification for quadrotors also apply machine learning techniques. Bansal et al. used NN models of the Crazyflie's dynamics to plan trajectories [5]. Our implementation differs by directly predicting change in attitude with on-board IMU measurements and motor voltages, rather than predicting with global, motion-capture state measurements and thrust targets for the internal PIDs. Using Bayesian Optimization to learn a linearized quadrotor dynamics model demonstrated capabilities for tuning of an optimal control scheme [19]. While this approach is data-efficient and is shown to outperform analytic models, the model learned is task-dependent. Our MBRL approach is task-agnostic, requiring only a change in objective function and no new dynamics data for a new task.

C. Model-Based Reinforcement Learning

Functionality of MBRL is evident in simulation for multiple tasks in low data regimes, including quadrupeds [20] and manipulation tasks [21]. Low-level MBRL control (i.e., with direct motor input signals) of an RC car has been demonstrated experimentally, but the system is of lower dimensionality and has static stability [22]. Relatively low-level control (i.e., mostly thrust commands only passed through an internal governor before conversion to motor signals) of an autonomous helicopter has been demonstrated, but required a ground-based vision system for error correction in state estimates as well as expert demonstration for model training [22].

Properly optimized NNs trained on experimental data show test error below common analytic dynamics models for flying vehicles, but the models did not include direct actuator signals and did not include experimental validation through controller implementation [23]. A model predictive path integral (MPPI) controller using a learned NN demonstrated data-efficient trajectory control of a quadrotor, but results were only shown in simulation and required the network to be initialized with 30 minutes of demonstration data with on-board controllers [2].

MBRL with trajectory sampling for control outperforms, in terms of samples needed for convergence, the asymptotic performance of recent model-free algorithms in low-dimensional tasks [3]. Our work builds on the strategies presented there, with most influence derived from "probabilistic" NNs, to demonstrate functionality in an experimental setting — i.e., in the presence of real-world higher-order effects, variability, and time constraints.

NN-based dynamics models with MPC have functioned for experimental control of an under-actuated hexapod [24]. The hexapod platform does not have the same requirements on frequency or control error due to its static stability, and incorporates a GPS unit for relatively low-noise state measurements. Our work has a similar architecture, but has improvements in the network model and model predictive controller to allow substantially higher control frequencies with noisy state data. By demonstrating functionality without global positioning data, the procedure can be extended to more robot platforms where only internal state and actuator commands are available to create a dynamics model and control policy.

III. EXPERIMENTAL SETUP

In this letter, we use as experimental hardware platform the open-source Crazyflie 2.0 quadrotor [25]. The Crazyflie is 27 g and 9 cm², so the rapid system dynamics create a need for a robust controller; by default, the internal PID controller used for attitude control runs at 500 Hz, with Euler angle state estimation updates at 1 kHz. This section specifies the ROS base-station and the firmware modifications required for external stability control of the Crazyflie.

All components we used are based on publicly available and open-source projects. We used the Crazyflie ROS interface supported here: https://github.com/whoenig/crazyflie_ros [26]. This interface allows for easy modification of the radio communication and employment of the learning framework. Our ROS structure is simple, with a Crazyflie subscribing to PWM values generated by a controller node, which processes radio packets sent from the quadrotor in order to pass state variables to the model predictive controller (as shown in Figure 2). The Crazyradio PA USB radio is used to send commands from the ROS server; software settings in the included client increase the maximum data transmission bitrate up to 2 Mbps, and a Crazyflie firmware modification improves the maximum traffic rate from 100 Hz to 400 Hz.

Fig. 2. The ROS computer passes control signals and state data between the MPC node and the Crazyflie ROS server. The Crazyflie ROS server packages Tx PWM values to send and unpacks Rx compressed log data from the robot.

In packaged radio transmissions from the ROS server we define actions directly as the pulse-width modulation (PWM) signals sent to the motors. To assign these PWM values directly to the motors we bypass the controller updates in the standard Crazyflie firmware by changing the motor power distribution whenever a CRTP Commander packet is received (see Figure 2). The Crazyflie ROS package sends empty ping packets to the Crazyflie to ask for logging data in the returning acknowledgment packet; without decreasing the logging payload and rate we could not simultaneously transmit PWM commands at the desired frequency due to radio communication constraints. We created a new internal logging block of compressed IMU data and Euler angle measurements to decrease the required bitrate for logging state information, trading state measurement precision for update frequency. Action commands and logged state data are communicated asynchronously; the ROS server control loop has a frequency set by the ROS rate command, while state data is logged based on a separate ROS topic frequency. To verify control frequency and reconstruct state-action pairs during autonomous rollouts we use a round-trip packet ID system.
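For concreteness, a minimal sketch of the base-station side of this loop is shown below. The topic name, message type, and hover PWM values are illustrative assumptions, not the actual crazyflie_ros interface, and the MPC query is stubbed out; the real system additionally handles the asynchronous logging and packet ID bookkeeping described above.

```python
# Hypothetical sketch of the base-station control loop (topic name, message
# type, and PWM values are illustrative, not the crazyflie_ros API).
import rospy
from std_msgs.msg import UInt16MultiArray

def compute_best_action():
    # Placeholder for the random-shooter MPC query described in Section V.
    return [30000, 30000, 30000, 30000]  # hypothetical [m1, m2, m3, m4] PWMs

def control_loop():
    rospy.init_node("mpc_controller")
    pub = rospy.Publisher("/crazyflie/pwm_cmd", UInt16MultiArray, queue_size=1)
    rate = rospy.Rate(50)  # locked control frequency; state logs arrive
                           # asynchronously on a separate topic
    while not rospy.is_shutdown():
        pub.publish(UInt16MultiArray(data=compute_best_action()))
        rate.sleep()

if __name__ == "__main__":
    control_loop()
```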
IV. LEARNING FORWARD DYNAMICS

The foundation of a controller in MBRL is a reliable forward dynamics model for predictions. In this letter, we refer to the current state and action as $s_t$ and $a_t$, which evolve according to the dynamics $f(s_t, a_t)$. Generating a dynamics model for the robot often consists of training an NN to fit a parametric function $f_\theta$ to predict the next state of the robot as a discrete change in state $s_{t+1} = s_t + f_\theta(s_t, a_t)$. In training, using a probabilistic loss function with a penalty term on the variance of estimates, as shown in Equation (1), better clusters predictions for more stable predictions across multiple time-steps [3]. The probabilistic loss fits a Gaussian distribution to each output of the network, represented in total by a mean vector $\mu_\theta$ and a covariance matrix $\Sigma_\theta$:

$$l = \sum_{n=1}^{N} \left[\mu_\theta(s_n, a_n) - s_{n+1}\right]^T \Sigma_\theta^{-1}(s_n, a_n) \left[\mu_\theta(s_n, a_n) - s_{n+1}\right] + \log \det \Sigma_\theta(s_n, a_n). \quad (1)$$

The probabilistic loss function assists model convergence and the variance penalty helps maintain stable predictions on longer time horizons. Our networks, implemented in PyTorch, train with the Adam optimizer [27] for 60 epochs with a learning rate of 0.0005 and a batch size of 32. Figure 3 summarizes the network design.
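As a concrete illustration, Equation (1) with a diagonal covariance (one predicted variance per state dimension, a common simplification we assume here rather than the paper's exact parameterization) could be written in PyTorch as:

```python
import torch

def probabilistic_loss(mean, logvar, next_state_delta):
    """Gaussian negative log-likelihood of Eq. (1), assuming diagonal Σθ.

    mean, logvar:     (batch, 9) network outputs for the change in state
    next_state_delta: (batch, 9) measured labels
    """
    err = mean - next_state_delta
    inv_var = torch.exp(-logvar)             # Σθ⁻¹ for a diagonal covariance
    mahalanobis = (err ** 2 * inv_var).sum(dim=1)
    log_det = logvar.sum(dim=1)              # log det Σθ = Σ log σ² per sample
    return (mahalanobis + log_det).mean()

# Training configuration reported above (optimizer shown for reference):
# optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # 60 epochs, batch 32
```

Predicting a log-variance rather than a variance keeps the covariance positive without constrained optimization, which is one standard way to realize this loss.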

Fig. 3. The NN dynamics model predicts the mean and variance of the change in state given the past 4 state-action pairs. We use 2 hidden layers of width 250 neurons.

All layers except for the output layer use the Swish activation function [28] with parameter β = 1. The network structure was cross-validated offline for prediction accuracy versus potential control frequency. Initial validation of training parameters was done on early experiments, and the final values are held constant for each rollout in the experiments reported in Section VI. The validation set is a random subset of measured $(s_t, a_t, s_{t+1})$ tuples in the pruned data.

Additional dynamics model accuracy could be gained with systematic model verification between rollouts, but experimental variation in the current setup would limit empirical insight, and a lower model loss does not guarantee improved flight time. Our initial experiments indicate improved flight performance with forward dynamics models minimizing the mean and variance of state predictions versus models minimizing mean squared prediction error, but more experiments are needed to state clear relationships between model parameters and flight performance.
Training a probabilistic NN to approximate the dynamics model requires pruning of logged data (e.g., dropped packets) and scaling of variables to assist model convergence. Our state $s_t$ is the vector of Euler angles (yaw, pitch, and roll), linear accelerations, and angular accelerations, reading

$$s_t = \left[\dot{\omega}_x, \dot{\omega}_y, \dot{\omega}_z, \phi, \theta, \psi, \ddot{x}, \ddot{y}, \ddot{z}\right]^T. \quad (2)$$

The Euler angles are from an internal complementary filter, while the linear and angular accelerations are measured directly from the on-board MPU-9250 9-axis IMU. In practice, for predicting across longer time horizons, modeling acceleration values as a global next state rather than a change in state increased the length of time horizon in composed predictions before the models diverged. While the change in Euler angle predictions is stable, the change in raw accelerations varies widely with sensor noise and causes non-physical dynamics predictions, so all the linear and angular accelerations are trained to fit the global next state.

We combine the state data with the four PWM values, $a_t = [m_1, m_2, m_3, m_4]^T$, to get the system information at time t. The NNs are cross-validated to confirm that using all state data (i.e., including the relatively noisy raw measurements) improves prediction accuracy in the change in state.

While the dynamics for a quadrotor are often represented as a linear system, for a Micro Air Vehicle (MAV) at high control frequencies motor step response and thrust asymmetry heavily impact the change in state, resulting in a heavily nonlinear dynamics model. The step response of a Crazyflie motor RPM from PWM 0 to max or from max to 0 is on the order of 250 ms, so our update time-step of 20 ms is short enough for motor spin-up to contribute to learned dynamics. To account for spin-up, we append past system information to the current state and PWMs to generate an input into the NN model that includes past time. From the exponential step response and with a bounded possible PWM value within $p_{eq} \pm 5000$, the motors need approximately 25 ms to reach the desired rotor speed; when operating at 50 Hz, the time step between updates is 20 ms, leading us to an appended state and PWM history of length 4. This state-action history length was validated as having the lowest test error on our data-set (lengths 1 to 10 evaluated). This yields the final input of length 52 to our NN, ξ, with states and actions combined to

$$\xi_t = \left[s_t \; s_{t-1} \; s_{t-2} \; s_{t-3} \; a_t \; a_{t-1} \; a_{t-2} \; a_{t-3}\right]^T.$$
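A PyTorch sketch of the network in Figure 3 under these dimensions (52 inputs, two hidden layers of 250 Swish units, and a 9-dimensional mean and log-variance output) might read as follows; the log-variance parameterization and layer names are our assumptions:

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, HISTORY = 9, 4, 4   # (9 + 4) * 4 = 52 inputs

class ForwardDynamics(nn.Module):
    """Sketch of the Fig. 3 model: mean and variance of the change in state."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear((STATE_DIM + ACTION_DIM) * HISTORY, 250),
            nn.SiLU(),   # Swish with β = 1 is the SiLU activation
            nn.Linear(250, 250),
            nn.SiLU(),
        )
        self.mean = nn.Linear(250, STATE_DIM)    # μθ, no output activation
        self.logvar = nn.Linear(250, STATE_DIM)  # diagonal Σθ as log-variance

    def forward(self, xi):
        # xi stacks the past 4 states and past 4 PWM actions: ξt in the text
        h = self.body(xi)
        return self.mean(h), self.logvar(h)
```

This pairs directly with the probabilistic loss sketched in the previous section: the two output heads feed the Mahalanobis and log-determinant terms of Eq. (1).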
Fig. 4. Predicted states for N = 50 candidate actions with the chosen "best action" highlighted in red. The predicted state evolution is expected to diverge from the ground truth for future t because actions are re-planned at every step.

V. LOW LEVEL MODEL-BASED CONTROL

This section explains how we incorporate our learned forward dynamics model into a functional controller. The dynamics model is used for control by predicting the state evolution given a certain action, and the MPC provides a framework for evaluating many action candidates simultaneously. We employ a 'random shooter' MPC, where a set of N randomly generated actions are simulated over a time horizon T. The best action is decided by a user-designed objective function that takes in the simulated trajectories $\hat{X}(a, s_t)$ and returns a best action, $a^*$, as visualized in Figure 4. The objective function minimizes the receding horizon cost of each state from the end of the prediction window to the current measurement.

The candidate actions, $\{a_i = (a_{i,1}, a_{i,2}, a_{i,3}, a_{i,4})\}_{i=1}^{N}$, are 4-tuples of motor PWM values centered around the stable hover-point for the Crazyflie. The candidate actions are constant across the prediction time horizon T. For a single sample $a_i$, each $a_{i,j}$ is chosen from a uniform random variable on the interval $[p_{eq,j} - \sigma, p_{eq,j} + \sigma]$, where $p_{eq,j}$ is the equilibrium PWM value for motor j. The range of the uniform distribution is controlled by the tuned parameter σ; this has the effect of restricting the variety of actions the Crazyflie can take. For the given range of PWM values for each motor, $[p_{eq} - \sigma, p_{eq} + \sigma]$, we discretize the candidate PWM values to a step size of 256 to match the future compression into a radio packet. This discretization of available action choices increases the coverage of the candidate action space. The compression of PWM resolution, while helpful for sampling and communication, represents an uncharacterized detriment to performance.
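A NumPy sketch of this sampling scheme follows; the equilibrium PWM and σ values below are placeholders (in practice $p_{eq}$ is estimated from flight data and σ is a tuned parameter):

```python
import numpy as np

def sample_candidate_actions(n=5000, p_eq=30000, sigma=5000, step=256):
    """Draw n candidate actions, each a 4-tuple of motor PWMs held constant
    over the horizon: uniform on [p_eq - sigma, p_eq + sigma] per motor,
    snapped to a 256-count grid to match the radio-packet compression."""
    raw = np.random.uniform(p_eq - sigma, p_eq + sigma, size=(n, 4))
    return (np.round(raw / step) * step).astype(int)

candidates = sample_candidate_actions()  # shape (5000, 4)
```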


Our investigation focuses on controlled hovering, but other tasks could be commanded with a simple change to the objective function. The objective we designed for stability seeks to minimize pitch and roll, while adding additional cost terms to Euler angle rates. In the cost function, λ affects the ratio between proportional and derivative gains. Adding cost terms to predicted accelerations did not improve performance because of the variance of the predictions.

$$a^* = \arg\min_{a} \sum_{t=1}^{T} \lambda(\psi_t^2 + \theta_t^2) + \dot{\psi}_t^2 + \dot{\theta}_t^2 + \dot{\phi}_t^2. \quad (3)$$

Our MPC operates on a time horizon T = 12 to leverage the predictive power of our model. Higher control frequencies can run at a cost of prediction horizon, such as T = 9 at 75 Hz or T = 6 at 100 Hz. The computational cost is proportional to the product of model size, number of actions (N), and time horizon (T). At high frequencies the time spanned by the dynamics model predictions shrinks because of a smaller dynamics step in prediction and by having less computation for longer T, limiting performance. At 50 Hz, a time horizon of 12 corresponds to a prediction of 240 ms into the future. Tuning the parameters of this methodology corresponds to changes in the likelihood of taking the best action, rather than modifying actuator responses, and therefore its effect on performance is less sensitive than changes to PID or standard controller parameters. At 50 Hz, the predictive power is strong, but the relatively low control frequency increases susceptibility to disturbances in between control updates. A system running with an Nvidia Titan Xp attains a maximum control frequency of 230 Hz with N = 5000, T = 1. For testing we use locked frequencies of 25 Hz and 50 Hz at N = 5000, T = 12.
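Combining the pieces, a simplified random-shooter MPC step might look like the sketch below. The state-vector indices, the value λ = 1, and the finite-difference stand-in for the Euler angle rates in Eq. (3) are our assumptions; the model is the ForwardDynamics sketch from Section IV and the candidates come from the sampling sketch above, converted to a float tensor.

```python
import torch

LAMBDA = 1.0  # assumed ratio between angle and angle-rate cost terms, Eq. (3)

def mpc_step(model, xi, candidates, horizon=12):
    """Pick the best of N constant candidate actions under the learned model.

    xi:         current length-52 input [s_t .. s_{t-3}, a_t .. a_{t-3}]
    candidates: (N, 4) float tensor of PWM 4-tuples
    """
    n = candidates.shape[0]
    hist = xi.repeat(n, 1)              # (N, 52): one rollout per candidate
    prev_angles = hist[:, 3:6]          # assumed layout: [φ, θ, ψ] at 3:6
    cost = torch.zeros(n)
    with torch.no_grad():
        for _ in range(horizon):
            mean, _ = model(hist)       # predicted change in state (mean only)
            state = hist[:, :9] + mean  # s_{t+1} = s_t + fθ(s_t, a_t)
            angles = state[:, 3:6]
            rates = angles - prev_angles  # finite-difference Euler angle rates
            # Eq. (3): λ(roll² + pitch²) plus squared Euler angle rates
            cost += LAMBDA * (angles[:, 0]**2 + angles[:, 1]**2) \
                    + (rates**2).sum(dim=1)
            prev_angles = angles
            # Slide the history window: newest state and (constant) action first.
            hist = torch.cat([state, hist[:, :27],
                              candidates, hist[:, 36:48]], dim=1)
    return candidates[cost.argmin()]
```

Because every candidate rollout is an independent batch row, the whole loop is a handful of batched matrix multiplies, which is what makes the GPU parallelization described above effective.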
VI. EXPERIMENTAL EVALUATION

We now describe the setting used in our experiments, the learning process of the system, and the performance summary of the control algorithm. Videos of the flying quadrotor, and full code for controlling the Crazyflie and reproducing the experiments, are available online at https://sites.google.com/berkeley.edu/mbrl-quadrotor/

A. Experimental Setting

The performance of our controller is measured by the average flight length over each roll-out. Failure is often due to drift-induced collisions, or, as in many earlier roll-outs, when flights reach a pitch or roll angle over 40°. In both cases, an emergency stop command is sent to the motors to minimize damage. Additionally, the simple on-board state estimator shows heavy inconsistencies on the Euler angles following a rapid throttle ramping, which is a potential limiting factor on the length of controlled flight. Notably, a quadrotor with internal PIDs enabled will still fail regularly due to drift on the same time frame as our controller; it is only with external inputs that the internal controllers will obtain substantially longer flights. The drift showcases the challenge of using attitude controllers to mitigate an offset in velocity.

Fig. 5. Mean and standard deviation of the 10 flights during each rollout learning at 25 Hz and 50 Hz. The 50 Hz shows a slight edge on final performance, but a much quicker learning ability per flight by having more action changes during control.

B. Learning Process

The learning process follows the RL framework of collecting data and iteratively updating the policy. We trained an initial model f0 on 124 and 394 points of dynamics data at 25 Hz and 50 Hz, respectively, from the Crazyflie being flown by a random action controller. Starting with this initial model as the MPC plant, the Crazyflie undertakes a series of autonomous flights from the ground with a 250 ms ramp-up, open-loop takeoff followed by on-policy control while logging data via radio. Each roll-out is a series of 10 flights, which causes large variances in flight time. The initial roll-outs have less control authority and inherently explore more extreme attitude orientations (often during crashes), which is valuable to future iterations that wish to recover from higher pitch and/or roll. The random and first three controlled roll-outs at 50 Hz are plotted in Figure 6 to show the rapid improvement of performance with little training data.

The full learning curves are shown in Figure 5. At both 25 Hz and 50 Hz the rate of flight improvement reaches its maximum once there are 1,000 trainable points for the dynamics model, which takes longer to collect at the lower control frequency. The improvement is after roll-out 1 at 50 Hz and roll-out 5 at 25 Hz. The longest individual flights at both control frequencies are over 5 s. The final models at 25 Hz and 50 Hz are trained on 2,608 and 9,655 points respectively, but peak performance is earlier due to dynamics model convergence and hardware lifetime limitations.
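In outline, the roll-out procedure above reduces to a short loop. The sketch below is schematic, with the training and flight steps injected as callables rather than our exact code:

```python
def run_learning(train, fly, seed_data, n_rollouts=12, flights_per_rollout=10):
    """Iterative MBRL loop of Section VI-B. `train` fits a dynamics model to
    all collected (state, action, next-state) tuples; `fly` performs one
    250 ms open-loop takeoff followed by on-policy MPC control and returns
    the data logged over radio. `seed_data` comes from random-action flights
    (124 points at 25 Hz, 394 at 50 Hz in our experiments)."""
    data = list(seed_data)
    model = train(data)                       # initial model f0
    for _ in range(n_rollouts):
        for _ in range(flights_per_rollout):  # each roll-out is 10 flights
            data += fly(model)
        model = train(data)                   # retrain between roll-outs
    return model
```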


Fig. 6. The pitch over time for each flight in the first four roll-outs of learning at 50 Hz, showing the rapid increase in control ability on limited data. The random and first controlled roll-outs show little ability, but roll-out 3 is already flying for > 2 seconds.

C. Performance Summary

This controller demonstrates the ability to hover, following a "clean" open-loop takeoff, for multiple seconds (an example is shown in Figure 8). At both 25 Hz and 50 Hz, once reaching maximum performance in the 12 roll-outs, about 30% of flights fail due to drift. The failures due to drift indicate the full potential of the MBRL solution to low-level quadrotor control. An example of a test flight segment is shown in Figure 7, where the control response to pitch and roll error is visible.

Fig. 7. The performance of the 50 Hz controller. (Above) The controlled PWM values over time, which visibly change in response to angle oscillations. (Below) Pitch and roll.

Typical quadrotor controllers, the basis of comparison, achieve better performance, but with higher control frequencies and engineering design iterations leveraging system dynamics knowledge. With the continued improvement of computational power, the performance of this method should be re-characterized as potential control frequencies approach that of PID controllers. Beyond comparison to PID controllers with low computational footprints, the results warrant exploration of MBRL for new dynamical systems, or when varying goals need to be built into low-level control. In less than 10 minutes of clock time, and only 3 minutes of training data, we present comparable, but limited, performance that is encouraging for future abilities to match and surpass basic controllers. Moving the balance of this work further towards domain-specific control would likely improve performance, but the broad potential for applications to more and different robotic platforms compels exciting future use of MBRL.

VII. DISCUSSION AND LIMITATIONS

The system has multiple factors contributing to the short length and high variance of flights. First, the PWM equilibrium values of the motors shift by over 10% following a collision, causing the true dynamics model to shift over time. This problem is partially mitigated by replacing the components of the Crazyflie, but any change of hardware causes dynamics model mismatch and the challenge persists. Additionally, the internal state estimator does not track extreme changes in Euler angles accurately. We believe that overcoming the system-level and dynamical limitations of controlling the Crazyflie in this manner showcases the expressive power of MBRL.

Improvements to the peak performance will come by identifying causes of the performance plateau. Elements to investigate include the data-limited slow-down in improvement of the dynamics model accuracy, the different collected data distributions at each roll-out, the stochasticity of NN training, and the stochasticity at running time with MPC.

Beyond improving performance, computational burden and safety hinder the applicability of MBRL with MPC to more systems. The current method requires a GPU-enabled base-station, but the computational efficiency could be improved with intelligent action sampling methods or by combining model-free techniques, such as learning a deterministic action policy based on the learned dynamics model. We are exploring methods to generate NN control policies, such as an imitative-MPC network or a model-free variant, on the dynamics model that could reduce computation by over 1000x by only evaluating a NN once per state measurement. In order to enhance safety, we are interested in defining safety constraints within the model predictive controller, rather than just a safety kill-switch in firmware, opening the door to fully autonomous learned control from start to finish.

VIII. CONCLUSIONS AND FUTURE WORK

This work is an exploration of the capabilities of model-based reinforcement learning for low-level control of an a priori unknown dynamic system. The results, with the added challenges of the static instability and fast dynamics of the Crazyflie, show the capabilities and future potential of MBRL. We detail the firmware modifications, system design, and model learning considerations required to enable the use of an MBRL-based MPC system for quadrotor control over radio. We removed all robot-specific transforms and higher-level commands to only design the controller on top of a learned dynamics model to accomplish a simple task. The controller shows the capability to hover for multiple seconds at a time with less than 3 minutes of collected data – approximately half of the full battery-life flight time of a Crazyflie. With learned flight in only minutes of testing, this brand of system-agnostic MBRL is an exciting solution not only due to its generalizability, but also due to its learning speed.

Fig. 8. A full flight of Euler angle state data with frames of the corresponding video. This flight would have continued longer if not for drifting into the wall. The
relation between physical orientation and pitch and roll is visible in the frames. The full video is online on the accompanying website.

In parallel with addressing the limitations outlined in Section VII, the quadrotor results warrant investigation into low-level control of other robots. The emergent area of microrobotics combines the issues of under-characterized dynamics, weak or non-existent controllers, "fast" dynamics and therefore instabilities, and high cost-to-test [29], [30], so it is a strong candidate for MBRL experiments.

ACKNOWLEDGMENT

The authors would like to thank the UC Berkeley Sensor & Actuator Center (BSAC), Berkeley DeepDrive, and Nvidia Inc.

REFERENCES

[1] M. P. Deisenroth, D. Fox, and C. E. Rasmussen, "Gaussian processes for data-efficient learning in robotics and control," IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 2, pp. 408–423, Feb. 2015.
[2] G. Williams et al., "Information theoretic MPC for model-based reinforcement learning," in Proc. Int. Conf. Robot. Autom., 2017, pp. 1714–1721.
[3] K. Chua, R. Calandra, R. McAllister, and S. Levine, "Deep reinforcement learning in a handful of trials using probabilistic dynamics models," in Proc. Int. Conf. Neural Inf. Process. Syst., 2018, pp. 4759–4770.
[4] J. Hwangbo, I. Sa, R. Siegwart, and M. Hutter, "Control of a quadrotor with reinforcement learning," IEEE Robot. Autom. Lett., vol. 2, no. 4, pp. 2096–2103, Oct. 2017.
[5] S. Bansal, A. K. Akametalu, F. J. Jiang, F. Laine, and C. J. Tomlin, "Learning quadrotor dynamics using neural network for flight control," in Proc. IEEE Conf. Decis. Control, 2016, pp. 4653–4660.
[6] R. Mahony, V. Kumar, and P. Corke, "Multirotor aerial vehicles," IEEE Robot. Autom. Mag., vol. 19, no. 3, pp. 20–32, Sep. 2012.
[7] D. Mellinger, N. Michael, and V. Kumar, "Trajectory generation and control for precise aggressive maneuvers with quadrotors," Int. J. Robot. Res., vol. 31, no. 5, pp. 664–674, 2012.
[8] W. Zhang, M. W. Mueller, and R. D'Andrea, "A controllable flying vehicle with a single moving part," in Proc. IEEE Int. Conf. Robot. Autom., 2016, pp. 3275–3281.
[9] M. W. Mueller and R. D'Andrea, "Stability and control of a quadrocopter despite the complete loss of one, two, or three propellers," in Proc. IEEE Int. Conf. Robot. Autom., 2014, pp. 45–52.
[10] M. Ryll, H. H. Bülthoff, and P. R. Giordano, "Modeling and control of a quadrotor UAV with tilting propellers," in Proc. IEEE Int. Conf. Robot. Autom., 2012, pp. 4606–4613.
[11] H. Liu, D. Li, J. Xi, and Y. Zhong, "Robust attitude controller design for miniature quadrotors," Int. J. Robust Nonlinear Control, vol. 26, no. 4, pp. 681–696, 2016.
[12] M. Bangura and R. Mahony, "Real-time model predictive control for quadrotors," Int. Fed. Autom. Control Proc. Vol., vol. 47, pp. 11773–11780, 2014.
[13] M. Abdolhosseini, Y. Zhang, and C. A. Rabbath, "An efficient model predictive control scheme for an unmanned quadrotor helicopter," J. Intell. Robot. Syst., vol. 70, no. 1–4, pp. 27–38, 2013.
[14] A. P. Schoellig, F. L. Mueller, and R. D'Andrea, "Optimization-based iterative learning for precise quadrocopter trajectory tracking," Auton. Robots, vol. 33, no. 1/2, pp. 103–127, 2012.
[15] C. Sferrazza, M. Muehlebach, and R. D'Andrea, "Trajectory tracking and iterative learning on an unmanned aerial vehicle using parametrized model predictive control," in Proc. IEEE Conf. Decis. Control, 2017, pp. 5186–5192.
[16] P. Bouffard, A. Aswani, and C. Tomlin, "Learning-based model predictive control on a quadrotor: Onboard implementation and experimental results," in Proc. IEEE Int. Conf. Robot. Autom., 2012, pp. 279–284.
[17] T. Koller, F. Berkenkamp, M. Turchetta, and A. Krause, "Learning-based model predictive control for safe exploration," in Proc. IEEE Conf. Decis. Control, 2018, pp. 6059–6066.
[18] F. Berkenkamp, A. P. Schoellig, and A. Krause, "Safe controller optimization for quadrotors with Gaussian processes," in Proc. IEEE Int. Conf. Robot. Autom., 2016, pp. 491–496.
[19] S. Bansal, R. Calandra, T. Xiao, S. Levine, and C. J. Tomlin, "Goal-driven dynamics learning via Bayesian optimization," in Proc. IEEE Conf. Decis. Control, 2017, pp. 5168–5173.
[20] I. Clavera, A. Nagabandi, R. S. Fearing, P. Abbeel, S. Levine, and C. Finn, "Learning to adapt: Meta-learning for model-based control," 2018, arXiv:1803.11347.
[21] A. Kupcsik, M. P. Deisenroth, J. Peters, A. P. Loh, P. Vadakkepat, and G. Neumann, "Model-based contextual policy search for data-efficient generalization of robot skills," Artif. Intell., vol. 247, pp. 415–439, 2017.
[22] P. Abbeel, Apprenticeship Learning and Reinforcement Learning With Application to Robotic Control. Stanford, CA, USA: Stanford Univ., 2008.
[23] A. Punjani and P. Abbeel, "Deep learning helicopter dynamics models," in Proc. IEEE Int. Conf. Robot. Autom., May 2015, pp. 3223–3230.
[24] A. Nagabandi et al., "Learning image-conditioned dynamics models for control of underactuated legged millirobots," in Proc. IEEE Int. Conf. Intell. Robots Syst., 2018, pp. 4606–4613.
[25] A. Bitcraze, "Crazyflie 2.0," 2016. [Online]. Available: https://www.bitcraze.io/crazyflie-2/
[26] W. Hönig and N. Ayanian, "Flying multiple UAVs using ROS," in Robot Operating System. New York, NY, USA: Springer, 2017, pp. 83–118.
[27] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," 2014, arXiv:1412.6980.
[28] P. Ramachandran, B. Zoph, and Q. V. Le, "Swish: A self-gated activation function," 2017, arXiv:1710.05941.
[29] D. S. Drew, N. O. Lambert, C. B. Schindler, and K. S. Pister, "Toward controlled flight of the ionocraft: A flying microrobot using electrohydrodynamic thrust with onboard sensing and no moving parts," IEEE Robot. Autom. Lett., vol. 3, no. 4, pp. 2807–2813, Oct. 2018.
[30] D. S. Contreras, D. S. Drew, and K. S. Pister, "First steps of a millimeter-scale walking silicon robot," in Proc. Int. Conf. Solid-State Sensors, Actuators Microsyst., 2017, pp. 910–913.
