Real-Time Neural MPC: Deep Learning Model Predictive Control for Quadrotors and Agile Robotic Platforms
2377-3766 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: University of Stellenbosch. Downloaded on March 30,2023 at 13:16:56 UTC from IEEE Xplore. Restrictions apply.
2398 IEEE ROBOTICS AND AUTOMATION LETTERS, VOL. 8, NO. 4, APRIL 2023
SALZMANN et al.: REAL-TIME NEURAL MPC: DEEP LEARNING MODEL PREDICTIVE CONTROL FOR QUADROTORS 2399
Our work is inspired by [2], [3], [4], [5], [6], [7] but replaces the Gaussian Process dynamics of [3], [4] or the small neural networks of [5], [6] with networks of higher modeling capacity [1], [2] and uses gradient-based optimization as opposed to a sampling-based scheme [7]. The resulting framework allows a combination of the versatile modeling capabilities of deep neural networks with state-of-the-art embedded optimization software without tightly constraining the choice of network architecture.

III. PROBLEM SETUP

In its most general form, MPC solves an optimal control problem (OCP) by finding an input command $u$ which minimizes a cost function $L$ subject to its system dynamics model $\dot{x} = f(x, u)$ while accounting for constraints on input and state variables for current and future timesteps. Traditionally, the model $f$ is manually derived from first principles using "simple" differential-algebraic equations (DAE), which often neglect complicated dynamics effects such as aerodynamics or friction as they are hard or computationally expensive to formalize. Following prior works [2], [3], [4], [5], we partition $f$ into a mathematical combination of first-principle DAEs $f_F$ and a learned data-driven model $f_D$. This enables more general models extending the capability of DAE dynamics models. To solve the aforementioned OCP, we approximate it by discretizing the system into $N$ steps of step size $\delta t$ over a time horizon $T$ using direct multiple shooting [25], which leads to the following nonlinear programming (NLP) problem

$$
\begin{aligned}
\min_{u} \quad & \sum_{k=0}^{N-1} L(x_k, u_k) \\
\text{subject to} \quad & x_{k=0} = x_0 \\
& x_{k+1} = \phi(x_k, u_k, f, \delta t) \\
& f(x_k, u_k) = f_F(x_k, u_k) + f_D(x_k, u_k) \\
& g(x_k, u_k) \le 0
\end{aligned}
\tag{1}
$$

where $x_0$ denotes the initial condition and $g$ can incorporate (in-)equality constraints, such as bounds in state and input variables. $\phi$ is the numerical integration routine to discretize the dynamics equation, where commonly a 4th order Runge-Kutta algorithm is used involving $E = 4$ evaluations of the dynamics function $f$. To leverage advancements in embedded solvers, the NLP is optimized using sequential quadratic programming (SQP) with $\omega$ being the SQP iterate $\omega^i = [x_0^i, u_0^i, \ldots, x_{N-1}^i, u_{N-1}^i]$.

IV. BRINGING NEURAL MPC TO ONBOARD REAL-TIME

In this section, we lay down the key concepts to speed up the optimization times of MPC control with neural networks. The key insight in Section IV-A is that local approximations of the learned dynamics are sufficient to maintain comparable performance while drastically speeding up the generation of the optimization problem. This insight is utilized in a three-phased embedded real-time optimization procedure in Section IV-B.

A. Locally Approximated Continuity Quadratic Program

Due to advances in embedded optimization solvers, SQP has become a well-suited framework to efficiently solve NLPs resulting from multiple shooting approximations of OCPs. This involves repetitively approximating and solving (1) as a quadratic program (QP). The solution to the QP leads to an update on the iterate $\omega^{i+1} = \omega^i + \Delta\omega^i$ where the step $\Delta\omega^i$ is given by solving the following QP

$$
\min_{\Delta\omega^i} \; \sum_{k=0}^{N-1}
\begin{bmatrix} q_k \\ r_k \end{bmatrix}^{\top}
\begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix}
+
\begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix}^{\top}
H_k
\begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix}
$$

subject to

$$
\Delta x_{k+1} = A_k \Delta x_k + B_k \Delta u_k + \bar{\phi}_k - x_{k+1}, \quad k = 0, \ldots, N-1, \tag{2}
$$

$$
-\bar{g}_k \ge G_k^x \Delta x_k + G_k^u \Delta u_k, \quad k = 0, \ldots, N, \tag{3}
$$

where $q_k = \frac{\partial}{\partial x_k^i} L(x_k^i, u_k^i)$ and $r_k = \frac{\partial}{\partial u_k^i} L(x_k^i, u_k^i)$ linearize the cost function and, under given circumstances, the Hessian $H_k$ can be approximated by the Gauss-Newton algorithm. $\bar{\phi}_k$ and $\bar{g}_k$ are shorthand notations for the function evaluations $\phi(x_k^i, u_k^i, f, \delta t)$ and $g(x_k^i, u_k^i)$. The main computational burden lies in the parameter computation of the continuity condition (2). Specifically, for each shooting node $k = 0, \ldots, N-1$ we need to compute

$$
A_k = \frac{\partial}{\partial x_k^i} \phi(x_k^i, u_k^i, f, \delta t), \quad
B_k = \frac{\partial}{\partial u_k^i} \phi(x_k^i, u_k^i, f, \delta t), \quad
\bar{\phi}_k = \phi(x_k^i, u_k^i, f, \delta t).
$$

This leads to $N \cdot E \cdot 2$ evaluations of the partial derivatives

$$
\partial f(x, u) = \partial f_F(x, u) + \partial f_D(x, u)
$$

and $N \cdot E$ function evaluations

$$
f(x, u) = f_F(x, u) + f_D(x, u)
$$

of the dynamics equation. For computationally heavy data-driven dynamics models $f_D$, this leads to long processing times when generating the QP.

The learned data-driven dynamics $f_D$ are assumed to be accurate over the entire input space of states and controls present in the training dataset. However, to create the QP continuity condition we only require the model and its derivatives to be accurate in and around specific input values $\omega^i$. Thus, to speed up the QP generation we replace the computationally heavy, globally valid data-driven dynamics equation $f_D$ with a computationally light, locally valid approximation up to second order around the current iterate

$$
f_D^{*}(x, u) \approx \bar{f}_D^{\,i}
+ J_{D,k}^{i} \begin{bmatrix} x - x_k^i \\ u - u_k^i \end{bmatrix}
+ \frac{1}{2} \begin{bmatrix} x - x_k^i \\ u - u_k^i \end{bmatrix}^{\top}
H_{D,k}^{i}
\begin{bmatrix} x - x_k^i \\ u - u_k^i \end{bmatrix}. \tag{4}
$$

The required derivatives are readily available as submatrices of $J_{D,k}^{i}$ for first-order approximations or as submatrices of a tensor multiplication and sum for second-order approximations. The induced error of this computational simplification is of second order for a first-order approximation and of third order for a second-order approximation in the size of state and control changes between nodes. We will experimentally demonstrate in Section VII that this error is negligible for agile platforms where $\delta t$ is small.

Applying (4), the QP creation becomes independent of the complexity and architecture of the data-driven dynamics model. Further, with $J_{D,k}^{i}$ and $H_{D,k}^{i}$ being the single interfaces between the SQP optimization and the data-driven dynamics model, we are free to optimize the approximation process independently of the NLP framework, passing them as parameters to the continuity condition procedure of the QP generation. As $f_D$ is a neural network model commonly consisting of large matrix multiplications, we are therefore free to use algorithms and hardware optimized for neural network evaluation and differentiation. Those capabilities are readily available in modern machine learning tools such as PyTorch [26] and TensorFlow [27]. This enables us to calculate the Jacobians and Hessians for all $N$ shooting nodes as a single parallelized batch on CPU or GPU.

B. Real-Time Neural MPC

Even without a data-driven dynamics model, solving the SQP until convergence is computationally too costly in real-time for agile robotic platforms. To account for this shortcoming, MPC applications subjected to fast dynamics are commonly solved using a real-time-iteration scheme (RTI) [28], where only a single SQP iteration is executed: one quadratic problem is constructed and solved, as a potentially sub-optimal but timely input command is preferred over an optimal late one. As shown in Fig. 2, RTN-MPC divides the real-time optimization procedure into three parts: QP Preparation Phase, Data-Driven Dynamics Preparation Phase and Feedback Response.

Fig. 2. Data flow for our RTN-MPC algorithm. The data-driven (DD) preparation phase is performed efficiently using optimized machine learning batch-differentiation tools on CPU or GPU.

With available iterate $\omega^i$, the data-driven dynamics preparation phase calculates $\bar{f}_D^{\,i}$ and $J_{D,k}^{i}$ using efficient batched differentiation of the data-driven dynamics on CPU or GPU. Meanwhile, the QP preparation phase constructs a QP by linearizing around the state $x^i$ and control $u^i$ using a first-order approximation $f_D^{*}(x, u)$ for the continuity condition, parametrized by the result of the data-driven dynamics preparation phase.

Once a new disturbed state $x_{k=0}$ is sensed, the feedback response phase solves the pre-constructed QP using the disturbed state as input. The iterate $\omega$ is adjusted with the QP result and the optimized command $u$ is sent to the actuators.

C. Implementation

To demonstrate the applicability of the RTN-MPC paradigm, we provide an implementation¹ using CasADi [29] and acados [30] as the optimization framework and PyTorch [26] as ML framework. This enables the research community to use arbitrary neural network models, trainable in PyTorch and usable in CasADi.

¹ Framework Code: https://github.com/TUM-AAS/ml-casadi

Further, we will compare our RTN-MPC approach against a naive implementation of a neural network data-driven MPC as applied in [2], [5], [6]. Here, the learned model is directly constructed in CasADi in the form of trained weight matrices and activation functions. Subsequently, the QP generation and automatic differentiation engine in CasADi has to deal with the full neural-network structure, for which it lacks optimized algorithms while being confined to the CPU.

V. RUNTIME ANALYSIS

We demonstrate the computational advantage of our proposed RTN-MPC paradigm compared to a naive implementation of a data-driven dynamics model in online MPC. To this end, we construct an experimental problem in which the nominal dynamics is trivial while the data-driven dynamics can be arbitrarily scaled in computational complexity. As such, the nominal dynamics model is a double integrator on a position $p$ while the data-driven dynamics is a neural network of variable architecture. To focus solely on the computational complexity of the data-driven dynamics, rather than modeling accuracy, the networks are not trained but weights are manually adjusted to force a zero output.

$$
\dot{x} = \begin{bmatrix} \dot{p} \\ \ddot{p} \end{bmatrix} = f_F(x, u) = \begin{bmatrix} \dot{p} \\ u \end{bmatrix}, \qquad
f(x, u) = f_F(x, u) + \overbrace{f_D(x, u)}^{0}. \tag{5}
$$

We use an explicit Runge-Kutta method of 4th order, $\phi(x, u, f, \delta t) = \mathrm{RK4}(x, u, f, \delta t)$, to numerically integrate $f$. In this experiment, we simulate the system without any model-plant mismatch to focus solely on runtime. The optimization problem is solved by constructing the multiple shooting scheme with $N = 10$ nodes.

Fig. 3 compares two-layer networks with increasing neuron count for a naive implementation and our RTN-MPC framework. On an embedded system, such as the Nvidia Jetson Xavier NX, our approach enables larger models of factor 60 in parametric complexity on CPU and of factor 4000 on GPU while staying

Fig. 3. Evaluation of real-time capability for different two-layer model parametric capacities. We evaluate on an embedded platform (Nvidia Jetson Xavier NX) and a laptop machine (Intel i7, Nvidia RTX 3000). Parametric model capacity is approximated by the squared number of neurons per layer. The RTN-MPC framework can run 4000 times larger models in parametric complexity compared to a naive implementation. To make the results comparable, we define a target run-time window of at least 50 Hz (dashed red line) and preferably over 100 Hz (dashed green line). However, in a real-world scenario the real-time window is specific to the use-case.
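As an illustration of how the local approximation (4) interacts with the integrator, the following minimal sketch (not the authors' implementation) applies a first-order approximation to a toy residual on the double integrator of (5) and compares one RK4 step against the step taken with the full residual. It uses plain Python instead of PyTorch and CasADi to stay self-contained. The weights W, the tanh residual, and the helper names make_local_f_D and combine are assumptions made for this example, and the residual is deliberately nonzero (unlike the zero-output networks of the runtime experiment) so that the approximation error is visible.

```python
import math

# Hypothetical weights of a tiny stand-in for the learned model f_D.
W = [0.8, -0.5, 0.3]


def f_F(x, u):
    """Nominal double integrator from (5): x = [p, p_dot], x_dot = [p_dot, u]."""
    return [x[1], u]


def f_D(x, u):
    """Stand-in residual acting on the acceleration only (assumption)."""
    s = W[0] * x[0] + W[1] * x[1] + W[2] * u
    return [0.0, 0.1 * math.tanh(s)]


def jac_f_D(x, u):
    """Analytic 2 x 3 Jacobian of f_D with respect to [p, p_dot, u]."""
    s = W[0] * x[0] + W[1] * x[1] + W[2] * u
    g = 0.1 * (1.0 - math.tanh(s) ** 2)
    return [[0.0, 0.0, 0.0], [g * W[0], g * W[1], g * W[2]]]


def make_local_f_D(x_k, u_k):
    """First-order version of (4) around the iterate (x_k, u_k)."""
    f_bar = f_D(x_k, u_k)   # evaluated once per shooting node
    J = jac_f_D(x_k, u_k)   # evaluated once per shooting node

    def f_D_star(x, u):
        dz = [x[0] - x_k[0], x[1] - x_k[1], u - u_k]
        return [f_bar[r] + sum(J[r][c] * dz[c] for c in range(3))
                for r in range(2)]

    return f_D_star


def rk4(x, u, f, dt):
    """phi(x, u, f, dt): one explicit 4th-order Runge-Kutta step (E = 4)."""
    def add(a, b, h):
        return [ai + h * bi for ai, bi in zip(a, b)]

    k1 = f(x, u)
    k2 = f(add(x, k1, dt / 2), u)
    k3 = f(add(x, k2, dt / 2), u)
    k4 = f(add(x, k3, dt), u)
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def combine(f_res):
    """Build f = f_F + f_res for a given residual function."""
    return lambda x, u: [a + b for a, b in zip(f_F(x, u), f_res(x, u))]


x_k, u_k, dt = [0.2, 1.0], 0.5, 0.02
exact = rk4(x_k, u_k, combine(f_D), dt)
local = rk4(x_k, u_k, combine(make_local_f_D(x_k, u_k)), dt)
gap = max(abs(a - b) for a, b in zip(exact, local))
print("largest difference between the two predictions:", gap)
```

For a small step size such as the one chosen here, the two predictions should agree closely, in line with the error order stated in Section IV-A. In the actual framework, the values playing the role of f_bar and J would come from a batched network evaluation and differentiation rather than from an analytic formula.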
TABLE II
RUNTIME COMPARISON BETWEEN NAIVE IMPLEMENTATION AND RTN-MPC
Fig. 4. Quadrotor model with world and body frames and propeller numbering
convention. Grey arrows indicate the spinning direction of the individual rotors.
VII. EXPERIMENTS

In our experiments we will re-validate the findings of previous works [2], [5] that using neural-network data-driven models in MPC improves tracking performance compared to no data-driven models or Gaussian Processes. More importantly, however, we will demonstrate that RTN-MPC enables the use of larger network capacities to fully exhaust possible performance gains while providing real-time capabilities.

All our experiments are divided into two phases: system identification and evaluation. During system identification, we collect data using the nominal dynamics model in the MPC controller. The state-control time series are further processed into subsequent (state, control) tuples. Each step is then re-simulated using the nominal controller and the error is used as the training label for the residual model.

During evaluation we track two fixed evaluation trajectories, Circle and Lemniscate, and measure the performance based on the reference position tracking error. As such, we report the (Mean) Euclidean Distance between the reference trajectory and the tracked trajectory as error.

To identify model architectures used in the experiments we use a naming convention stating the model type followed by the size and the implementation type, where we differentiate between our RTN-MPC approach (-Ours) and a naive integration (-Naive). N-3-32-Ours is a neural network model with 3 hidden layers of 32 neurons each using our RTN-MPC framework, and N-3-32-Naive is the same model using a naive integration. GP-20 is a Gaussian Process model with 20 inducing points.

All of our learned dynamic models are trained with a batch size of 64 and a learning rate of 1e-4 using the Adam optimizer. We split all datasets into a training and validation part and train the models using early stopping on the validation set. Dataset sizes are 20 k datapoints for the simple simulation environment, 200 k for the BEM simulation environment, and we use the openly available dataset presented in [1] with 1.8 million datapoints for the real-world experiment.

When comparing against GPs we follow the original implementation of [3] for the $f_{D_a}$ model configuration, where one single-input-single-output GP is trained per dimension. For the $f_{D_{a,u}}$ configuration, their implementation is extended to a multi-input-single-output GP per dimension.

A. Simulation

We use two simulation environments featuring varying modeling accuracy and real-time requirements to compare against a non-augmented MPC controller, a naive integration of data-driven dynamics [2], [5], [6], and GPs [3] with respect to real-time capability and model capacity.

Simplified Quadrotor Simulation: We use the simulation framework described in [3], where perfect odometry measurements and ideal tracking of the commanded single rotor thrusts are assumed. Drag effects by the rotors and fuselage are simulated, as well as zero mean ($\sigma = 0.005$) constant Gaussian noise on forces and torques, and zero mean Gaussian noise on motor voltage signals with standard deviation proportional to the input magnitude, $\sigma = 0.02\sqrt{u}$. There are no run-time constraints as controller and simulator are run sequentially in simulated time. Using the simplified simulation, we analyze the predictive performance and run-time of our approach for varying network sizes and directly compare to the naive implementation and Gaussian Process approach. We constrain the residual model to linear accelerations $f_{D_a}$ to facilitate comparison with prior work [3]. To fairly evaluate the run-times of our full and distributed approach, and considering the limited resources of embedded systems, this experiment was performed on a single CPU core. The results are depicted in Table III. We also compare with a Nominal model, where no learned residuals are modeled in the dynamics function, and with an oracle-like Perfect model, which uses the same dynamics equations as the simulation (excluding noise). Neural networks which achieve accurate modeling performance on the simulated dynamics are integrated easily with real-time optimization times below 3 ms using our approach, while they have high optimization times (up to 36 ms) when a naive integration approach is used. The local approximations described in Section IV-A do not negatively influence performance compared to a naive implementation. Furthermore, we demonstrate that such modeling performance is not reachable with a GP even when using a large number of supporting points.

BEM Quadrotor Simulation: In addition to the simplified simulation setting, we also evaluate our approach in a highly accurate aerodynamics simulator based on Blade-Element-Momentum-Theory (BEM) [1]. In contrast to the simplified simulation setting, this simulation can accurately model lift and drag produced by each rotor from the current ego-motion of the platform and the individual rotor speeds. The simulator runs in real-time and communicates with the controller via the Robot Operating System (ROS). We target a real-time control
TABLE IV
RESULTS FOR THE REAL-WORLD EXPERIMENT
An open challenge, which is not yet considered in this work, but the authors plan to tackle in the future, is to use a historic sequence of states and control input in a learned dynamics model. This would naturally lead to incorporating sequential and temporal models such as LSTMs, GRUs, and TCNs in the optimization loop using our approach and would give rise to running approaches currently only feasible in simulation [1] in embedded MPC real-time.

We experimentally show that the controller's performance is not negatively affected by the real-time inducing approximations. Thus, this method overcomes the limitation of having to sacrifice performance for efficiency as described in previous works [2], [5]. We demonstrate its usefulness by evaluating the isolated real-time capability of RTN-MPC on different devices and applying the framework to the challenging problem of trajectory tracking of a highly agile quadrotor, reducing the tracking error substantially while using powerful models on-device.

ACKNOWLEDGMENT

We would like to thank Matteo Zallio for his help in visually communicating our work.

REFERENCES

[1] L. Bauersfeld, E. Kaufmann, P. Foehn, S. Sun, and D. Scaramuzza, “NeuroBEM: Hybrid aerodynamic quadrotor model,” in Proc. Robot.: Sci. Syst., 2021. [Online]. Available: http://arxiv.org/abs/2106.08015
[2] A. Saviolo, G. Li, and G. Loianno, “Physics-inspired temporal learning of quadrotor dynamics for accurate model predictive trajectory tracking,” IEEE Robot. Automat. Lett., vol. 7, no. 4, pp. 10256–10263, Oct. 2022. [Online]. Available: https://ieeexplore.ieee.org/document/9834096/
[3] G. Torrente, E. Kaufmann, P. Foehn, and D. Scaramuzza, “Data-driven MPC for quadrotors,” IEEE Robot. Automat. Lett., vol. 6, no. 2, pp. 3769–3776, Apr. 2021. [Online]. Available: http://arxiv.org/abs/2102.05773
[4] J. Kabzan, L. Hewing, A. Liniger, and M. N. Zeilinger, “Learning-based model predictive control for autonomous racing,” IEEE Robot. Automat. Lett., vol. 4, no. 4, pp. 3363–3370, Oct. 2019.
[5] K. Y. Chee, T. Z. Jiahao, and M. A. Hsieh, “KNODE-MPC: A knowledge-based data-driven predictive control framework for aerial robots,” IEEE Robot. Automat. Lett., vol. 7, no. 2, pp. 2819–2826, Apr. 2022. [Online]. Available: https://ieeexplore.ieee.org/document/9691797/
[6] N. A. Spielberg, M. Brown, and J. C. Gerdes, “Neural network model predictive motion control applied to automated driving with unknown friction,” IEEE Trans. Control Syst. Technol., vol. 30, no. 5, pp. 1934–1945, Sep. 2022. [Online]. Available: https://ieeexplore.ieee.org/document/9638389/
[7] G. Williams et al., “Information theoretic MPC for model-based reinforcement learning,” in Proc. Int. Conf. Robot. Automat., 2017, pp. 1714–1721. [Online]. Available: https://ieeexplore.ieee.org/document/7989202/
[8] G. Shi et al., “Neural lander: Stable drone landing control using learned dynamics,” in Proc. Int. Conf. Robot. Automat., 2019, pp. 9784–9790. [Online]. Available: http://arxiv.org/abs/1811.08027 http://dx.doi.org/10.1109/ICRA.2019.8794351
[9] M. Faessler, A. Franchi, and D. Scaramuzza, “Differential flatness of quadrotor dynamics subject to rotor drag for accurate tracking of high-speed trajectories,” IEEE Robot. Automat. Lett., vol. 3, no. 2, pp. 620–626, Apr. 2018.
[10] K. Chua, R. Calandra, R. McAllister, and S. Levine, “Deep reinforcement learning in a handful of trials using probabilistic dynamics models,” in Proc. Neural Inf. Process. Syst., pp. 4754–4765, 2018. [Online]. Available: http://arxiv.org/abs/1805.12114
[11] N. O. Lambert, D. S. Drew, J. Yaconelli, S. Levine, R. Calandra, and K. S. J. Pister, “Low-level control of a quadrotor with deep model-based reinforcement learning,” IEEE Robot. Automat. Lett., vol. 4, no. 4, pp. 4224–4230, Oct. 2019. [Online]. Available: https://ieeexplore.ieee.org/document/8769882/
[12] E. Maddalena, C. da S. Moraes, G. Waltrich, and C. Jones, “A neural network architecture to learn explicit MPC controllers from data,” IFAC-PapersOnLine, vol. 53, no. 2, pp. 11362–11367, 2020. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S2405896320308442
[13] J. Nubert, J. Köhler, V. Berenz, F. Allgöwer, and S. Trimpe, “Safe and fast tracking on a robot manipulator: Robust MPC and neural network control,” IEEE Robot. Automat. Lett., vol. 5, no. 2, pp. 3050–3057, Apr. 2020.
[14] D. Wang et al., “Model predictive control using artificial neural network for power converters,” IEEE Trans. Ind. Electron., vol. 69, no. 4, pp. 3689–3699, Apr. 2022.
[15] R. Winqvist, A. Venkitaraman, and B. Wahlberg, “On training and evaluation of neural network approaches for model predictive control,” 2020, arXiv:2005.04112.
[16] E. Kaufmann, A. Loquercio, R. Ranftl, M. Müller, V. Koltun, and D. Scaramuzza, “Deep drone acrobatics,” in Proc. 13th Int. Joint Conf. Artif. Int., Z. H. Zhou, Ed. Aug. 2021, pp. 4780–4783, doi: 10.24963/ijcai.2021/650.
[17] M. Henaff, A. Canziani, and Y. LeCun, “Model-predictive policy learning with uncertainty regularization for driving in dense traffic,” in Proc. Int. Conf. Learn. Representations, 2019. [Online]. Available: https://openreview.net/forum?id=HygQBn0cYm
[18] S. Bansal, A. K. Akametalu, F. J. Jiang, F. Laine, and C. J. Tomlin, “Learning quadrotor dynamics using neural network for flight control,” in Proc. IEEE Conf. Decis. Control Inst. Elect. Electron. Eng. Inc, 2016, pp. 4653–4660.
[19] A. Punjani and P. Abbeel, “Deep learning helicopter dynamics models,” in Proc. Int. Conf. Robot. Automat., 2015, pp. 3223–3230.
[20] Z. Li, N. B. Kovachki, K. Azizzadenesheli, K. Bhattacharya, A. Stuart, and A. Anandkumar, “Fourier neural operator for parametric partial differential equations,” in Proc. Int. Conf. Learn. Representations, 2020.
[21] N. A. Spielberg, M. Brown, N. R. Kapania, J. C. Kegelman, and J. C. Gerdes, “Neural network vehicle models for high-performance automated driving,” Sci. Robot., vol. 4, no. 28, 2019, Art. no. eaaw1975.
[22] J. Hwangbo et al., “Learning agile and dynamic motor skills for legged robots,” Sci. Robot., vol. 4, no. 26, 2019, Art. no. eaau5872.
[23] I. Lenz, R. Knepper, and A. Saxena, “DeepMPC: Learning deep latent features for model predictive control,” in Proc. Robot.: Sci. Syst. Robot.: Sci. Syst. Found., 2015. [Online]. Available: http://www.roboticsproceedings.org/rss11/p12.pdf
[24] O. M. Andrychowicz et al., “Learning dexterous in-hand manipulation,” Int. J. Robot. Res., vol. 39, no. 1, pp. 3–20, 2020.
[25] H. Bock and K. Plitt, “A multiple shooting algorithm for direct solution of optimal control problems,” IFAC Proc. Volumes, vol. 17, no. 2, pp. 1603–1608, 1984. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S1474667017612059
[26] A. Paszke et al., “PyTorch: An imperative style, high-performance deep learning library,” in Proc. 33rd Int. Conf. Neural Inf. Process. Syst., 2019, pp. 8026–8037.
[27] M. Abadi et al., “TensorFlow: Large-scale machine learning on heterogeneous distributed systems,” 2016, arXiv:1603.04467.
[28] M. Diehl, H. Bock, J. P. Schlöder, R. Findeisen, Z. Nagy, and F. Allgöwer, “Real-time optimization and nonlinear model predictive control of processes governed by differential-algebraic equations,” J. Process Control, vol. 12, no. 4, pp. 577–585, 2002. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S0959152401000233
[29] J. A. E. Andersson, J. Gillis, G. Horn, J. B. Rawlings, and M. Diehl, “CasADi: A software framework for nonlinear optimization and optimal control,” Math. Program. Computation, vol. 11, no. 1, pp. 1–36, 2019. [Online]. Available: http://link.springer.com/10.1007/s12532-018-0139-4
[30] R. Verschueren et al., “acados: A modular open-source framework for fast embedded optimal control,” Math. Program. Comput., vol. 14, no. 1, pp. 147–183, 2021. [Online]. Available: http://arxiv.org/abs/1910.13753
[31] F. Ramzan et al., “A deep learning approach for automated diagnosis and multi-class classification of alzheimer’s disease stages using resting-state FMRI and residual neural networks,” J. Med. Syst., vol. 44, pp. 1–16, 2020.
[32] D. Falanga, P. Foehn, P. Lu, and D. Scaramuzza, “PAMPC: Perception-aware model predictive control for quadrotors,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2018, pp. 1–8. [Online]. Available: https://ieeexplore.ieee.org/document/8593739/
[33] M. Kamel, T. Stastny, K. Alexis, and R. Siegwart, “Model predictive control for trajectory tracking of unmanned aerial vehicles using robot operating system,” in Proc. Robot Operating Syst., 2017, pp. 3–39.
[34] P. Foehn et al., “Agilicious: Open-source and open-hardware agile quadrotor for vision-based flight,” Sci. Robot., vol. 7, no. 67, 2022, Art. no. eabl6259.