MPC in automated car
This article discusses a Model Predictive Controller (MPC) I built as part of Udacity’s
self-driving car nanodegree program (term 2). The project objective was to control a
vehicle in a simulator environment to drive as fast as possible without leaving the
drivable area. This work was done in the Spring of 2017 — and for more details, please
refer to my GitHub repository.
An inquisitive reader may wonder why we even need a controller if we already have a
reference trajectory coming from the path-planner module. The reason is that the
reference trajectory consists of positions (x, y) and velocities, which we cannot
influence directly. The only “knobs” we can directly influence are the vehicle’s
throttle and steering angle. The job of the controller is therefore to adjust the
throttle and steering angle so that the vehicle’s true position and velocity stay as
close as possible to the reference position and velocity.
There are many types of controllers, ranging from simpler ones like
Proportional-Integral-Derivative (PID) and Linear Quadratic Regulator (LQR)
controllers to more sophisticated ones like MPC.
MPC is an optimal controller used when a model of the system being controlled is
available. The goal of MPC is to minimize a predefined cost function
while satisfying constraints such as system dynamics, actuator limitations, etc. At each
time step, we calculate the best set of control actions that minimizes the cost function
over a specific time horizon and pick the action for the most immediate time step —
and the process repeats at the following time step. This is illustrated in the figure
below.
MPC Illustration (public-domain image from the Wikipedia article on MPC)
MPC Overview
As stated above, in MPC the objective is to minimize the cost function while satisfying
system dynamics and actuator constraints. This is mathematically represented as
follows:
MPC Objective (algorithm obtained from CS287 lecture notes [2])
Where x is the state, u is the control input/actuation, c is the cost function, and f is the
system dynamics model. There are additional constraints, not shown in the figure
above, such as: (1) the initial state used for the optimization must equal the currently
observed state, and (2) actuation limits, since we cannot command the actuators
arbitrarily.
While in the above figure the cost function and system constraints are computed
through the end of time T (i.e. at each time-step we are re-planning the entire
trajectory), this is not feasible for most practical situations as T can be very long.
Instead, a more practical approach is to re-plan the trajectory at each time step out to a
fixed time horizon H, so the cost summation and constraints run from k=t to k=t+H. For
this reason, MPC is also known as receding horizon control.
Although we will dive into each of these in greater detail below, at a high level the
following steps are used to set up an MPC controller — as per Udacity’s MPC
lecture notes [4]:
1. Depending upon the problem statement, set the horizon window by selecting the
appropriate trajectory length (N) and time-step duration (dt).
2. Using the reference trajectory way points (from the path planner), fit a polynomial
curve, which is then used to compute the cross track and orientation errors.
3. Using the problem statement again, select an appropriate vehicle dynamics model.
4. Determine any physical constraints that must be obeyed, e.g. actuator limitations
and other such limitations.
5. Using the cross track and orientation errors, create an appropriate cost function for
the MPC solver to minimize.
Upon building an appropriate vehicle dynamics model, cost function, and other
parameters as outlined in the above five steps, we can now execute an MPC controller
as follows:
1. Set the MPC solver’s initial state to the vehicle’s current state.
2. Run the optimization solver (Ipopt solver was used for this project). The solver
returns a vector of control inputs at each time step (over the horizon window) that
minimizes the cost function.
3. Apply the first time-step’s control/actuation signals to the vehicle and discard the
rest of the control inputs.
Having seen an overview of how an MPC controller works, let us now dive deeper into
each of the steps outlined above.
A. Vehicle Dynamics Model
Since MPC solves a constrained optimization problem at every time step, having a good
vehicle dynamics model is very important. But because the problem must be solved at
every time step, the model cannot be too complicated, owing to the associated
computational cost. There is therefore a tradeoff between model accuracy and
computational speed. For this project, the model used is the basic kinematic model of
the vehicle. It is a state-space kinematic model with the state vector defined as s
= [x, y, ψ, v], where (x, y) is the vehicle’s position, ψ its orientation (heading), and v
its speed.
The simulator provides the state variables x, y, ψ, and v. The (x, y) position, which is in
the map coordinate system, needs to be transformed into the vehicle coordinate system
using translation followed by rotation.
Vehicle Actuators
The actuation signal is a vector [a, δ] where a is the throttle that controls
acceleration/braking (negative throttle corresponds to braking) and δ is the steering
angle that controls the vehicle orientation. These are computed by the MPC
optimizer/solver and then sent to the simulator. The actuation signal is bounded by
fixed lower and upper limits on both the throttle and the steering angle.
The state update equations are basic kinematic equations of motion used to set
constraints on the optimization problem such that the optimizer takes the vehicle
dynamics into account. They are first-order discrete-time difference-equations. The
following dynamics model is used to compute the next state given the current one:
x’ = x + v*cos(ψ)*dt
y’ = y + v*sin(ψ)*dt
v’ = v + a*dt
ψ’ = ψ + (v/L_f)*δ*dt
(equations by author)
While the Ipopt solver used in this project can handle these nonlinear constraints, other
solvers may not. In that case, the constraint equations can be linearized at each
time step (using a Taylor series expansion) and rewritten in the canonical
state-space matrix form s’ = As + Bu,
where s is the state vector, u is the control vector, A is the matrix describing natural
state transition, and B is the matrix describing control action’s effect on the state vector.
Additionally, if the state is not fully observable, then we have the observation equation
as o = Cs using the observation matrix C.
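For instance, linearizing the kinematic model above about an operating point (s̄, ū) gives the following first-order approximation (symbols as in the update equations, with state order [x, y, ψ, v] and control order [a, δ]):

```latex
A = \left.\frac{\partial f}{\partial s}\right|_{\bar{s},\bar{u}} =
\begin{bmatrix}
1 & 0 & -\bar{v}\sin(\bar{\psi})\,dt & \cos(\bar{\psi})\,dt \\
0 & 1 & \phantom{-}\bar{v}\cos(\bar{\psi})\,dt & \sin(\bar{\psi})\,dt \\
0 & 0 & 1 & (\bar{\delta}/L_f)\,dt \\
0 & 0 & 0 & 1
\end{bmatrix},
\qquad
B = \left.\frac{\partial f}{\partial u}\right|_{\bar{s},\bar{u}} =
\begin{bmatrix}
0 & 0 \\
0 & 0 \\
0 & (\bar{v}/L_f)\,dt \\
dt & 0
\end{bmatrix}
```

Strictly, the linearized dynamics hold only near the operating point, s’ ≈ f(s̄, ū) + A(s − s̄) + B(u − ū), which is why the linearization must be redone at each time step.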
That said, for MPC to work well, we need full state observability. While for this
project the state is indeed fully observable (i.e. the matrix
C is just a 4x4 identity matrix), in most practical situations the state is not fully
observable. In such cases we need to use a Bayes filter based state estimator (e.g.
Kalman filter) to get the full state representation from partial (or noisy) state
observations.
B. Cost Function
An important component of the MPC is its cost/error function — because that is what
the optimizer minimizes. A well defined cost function is crucial for the optimizer to be
effective, and it must be tuned to the problem at hand. The cost function used in this
project is a weighted sum of the following quadratic terms at each time step, summed
across the horizon window (the number of time steps we forecast into the future):
squared cross track error (cte): the difference between the car’s position and the
reference trajectory
squared orientation error (ψ error, i.e. epsi): the difference between the car’s
orientation and the desired/target orientation
squared speed error: the difference between the car’s speed and the reference speed
squared value of the throttle and steering actuation (we want to minimize this as it
influences the car’s energy consumption)
squared value of the rate of change of throttle and steering actuation (for smooth
driving)
cte and epsi are given large weights for accurate control. Furthermore, the rate of
change of steering angle and throttle are also given large weights for a smooth
trajectory.
The reference position used in computing cte is set to zero (i.e. corresponding to the
middle of the track). The reference orientation used in computing epsi is also set to zero
(i.e. the vehicle should be forward facing). The reference speed is set to 40 m/s. The
horizon window is represented as N in the below code snippet. These error terms are
multiplied by carefully chosen weights; and in the below code snippet, the cte weight is
shown as wt1, the epsi weight as wt2, etc. Proper tuning of these weights (by
experimenting with different values) is crucial for fast and smooth driving. Below is a
code snippet showing how the total cost fg[0] is calculated.
Cost Function (code by author)
In order to speed up the optimization process, the initial actuation values used by the
MPC solver are set to the most recent actuation signals actually applied to the
vehicle. This is more realistic (and computationally faster) than random or
zero initialization.
C. Horizon Window
The time-step duration (dt) and the trajectory length (N) are two of the important
parameters used for tuning the MPC. Together they determine the horizon window over
which the optimization is performed.
Frequency (dt)
Through experimentation, it was observed that a dt of 110 msec or below caused the car
to oscillate about the center of the track, likely because the actuator inputs are
applied much faster than the vehicle can respond; unable to keep up, the car
almost continuously tries to course-correct.
Conversely, with a large dt of 200 msec and above, the motion is smoother but the car
rolls off the track edges around curves, because the actuator inputs are now applied
much more slowly than the road conditions change (a rate that depends on vehicle
speed). Optimal performance was obtained with a dt of about 145 msec.
Trajectory Length (N)
This determines the number of time steps the MPC controller plans into the future.
The higher it is, the further ahead the controller plans. But the downside of a
large N is that it requires a lot more computations, which can lead to latency induced
inaccuracies as the solver may not be able to provide the solution in real-time.
Additionally, with a large N, any model inaccuracies will cause erroneous actuation
predictions for longer periods of time — hence compounding the effects of model
inaccuracies.
N was fine-tuned after first finding the optimal value for dt. Once dt was set, a
few different values for N were tried. A high N of 30 caused the car motion to become
rather jerky and was computationally expensive. The optimal value of N seemed to be
around 10.
D. Polynomial Curve Fitting
The path-planner module generates the reference (x,y) co-ordinates that are sent to the
vehicle controller. These co-ordinates are in the map space, so they are first
transformed to the vehicle’s co-ordinate system. Thereafter, a 3rd-degree (cubic)
polynomial is fitted to these (x, y) waypoints. This polynomial then describes the
reference trajectory and is used to compute the cross track error (cte) at each
time step over the horizon window when performing optimization. For best
performance and robustness, the waypoints are not preprocessed; instead, a polynomial
curve is fitted in real-time.
E. Hardware Latency
To realistically account for real-world hardware delays, the simulator has a built-in
100ms latency. As a result, the throttle and steering angle control signals sent to the
simulator are actually applied to the vehicle after this delay. Unlike other controllers,
MPC controllers are generally much better at handling latency due to the following two
reasons:
1. We can explicitly account for such delays in the state-update equations before
passing the current vehicle state to the MPC solver. In other words, the initial state
used by the MPC solver is generated by predicting the measured state forward in
time (i.e. the latency time) using the kinematic model of the vehicle and assuming
the previous set of actuations holds over the latency period. This means that the
optimal control signals generated by the solver already account for the true vehicle
state at the moment the actuation signals are actually applied.
2. The frequency (dt) parameter can be set to be slightly larger than the latency —
this way the state update equations will implicitly take latency into account.
With these two approaches, the MPC controller can handle latencies of even
hundreds of milliseconds, something very challenging (if not impossible) for
traditional controllers like PID, LQR, etc.
Conclusion
MPC optimizes the car’s trajectory based upon its current state by computing various
trajectories (i.e. steering angle and throttle actuations at each time-step across the
horizon window) and their corresponding costs. The trajectory with the lowest cost is
selected and the actuation vector corresponding to the first time-step is executed. And
the process repeats at the next time-step.
Hence, MPC is much better than traditional controllers because it optimizes the
vehicle’s actuation signals based upon its current state, its dynamics, the actuator
constraints, and the reference trajectory from the path-planner. As a result, I was able to
achieve a much higher drivable speed compared to the results of my PID controller or
behavioral cloning projects. The vehicle never left the drivable portion of the track even
while moving at a speed of over 80 MPH.
Future Experimentation
MPC + Path-Planner
Another thing I’d like to investigate in the future is to combine the path-planner with
the MPC controller, since both solve a similar optimization problem. In
particular, the path-planner objective is to find the optimal (x,y) co-ordinates at
different time-steps while (1) driving at a particular speed, (2) being in the middle of
the track, and (3) avoiding obstacles. Whereas the MPC objective is to find the optimal
control actions at different time-steps while (1) following the reference trajectory
(generated by the path-planner) and (2) satisfying the vehicle dynamics and actuator
constraints.
Instead of using a simple PID controller, you can use MPC to:
Closely follow the reference trajectory generated by the localization/mapping (SLAM) system.
Control the velocity and heading (or the speed of each wheel) so that the robot follows
that trajectory accurately.
Optimize the motion to avoid collisions, save energy, and maintain stability.
▶️ In a trash-picking robot, if SLAM produces the trajectory to follow in order to reach the trash bin,
MPC takes on the role of computing the actual control commands (wheel and motor
control) so that the robot tracks that trajectory precisely.
👉 MPC can incorporate these physical constraints directly into the optimization problem,
allowing the robot to move more safely and efficiently in real-world environments.
✅ 3. Navigation in complex environments with many obstacles
✅ 4. Built-in prediction for real-time situations
MPC looks ahead at how the current control command will affect the future.
This lets the robot take preventive action (for example, slowing down before
a curve, or avoiding obstacles from a distance).
PID is simple and easy to apply, but hard to scale when there are many disturbances and
constraints.
MPC optimizes over both the current state and future states.
It is well suited to robots operating in real environments, especially when you
have a good dynamics model and a stable localization system.
Learn a model (a neural network) that imitates MPC offline, then run real-time
inference instead of solving the optimization at every step, reducing the computational cost.
Deep Reinforcement Learning can be used to learn a control policy in place of
MPC when the system is too complex to model accurately.
📌 Conclusion
Precise tracking of the trajectory generated by SLAM.
Optimized control commands based on the current state and physical constraints.
Greater stability and better operation in real-world environments.
Good integration with AI to raise the level of automation of the whole system.
If you are already handling localization, mapping, and waste classification, adding MPC
is a major upgrade in intelligent control.
https://www.mathworks.com/videos/series/understanding-model-predictive-control.html