Formula 1 Grand Prix Simulator A Dynamic
success.
Here we model the decision problems at each lap as a dynamic game between pilots, which eventually results in a race simulation based on game-theoretically optimal actions. The power of our architecture lies in its ability to reproduce real-life races and in its high compatibility with pre-race strategic settings: by combining previous results with our work, one could therefore obtain a working and fully characterized race simulator that relies entirely on game-theoretically optimal strategies.

2. GT model

The core of our implementation is the decision-making module, which we design on the basis of a competitive model and solve following the rules of Game Theory. We model the drivers as rational decision-makers and aim to describe the choices they face in a mathematical way, so that they can decide according to their rationality.
Consider the following frame of the race: a fast driver is approaching the driver in front of him and has to decide either to attack him to perform an overtake or to wait. In a real-life scenario he has to evaluate many different variables in a fraction of a second: the velocity of his car and that of the opponent, the personal skill-set of both players, the probability that the attempted overtake is successful and the possibility of a car crash due to the craziness of their driving styles. On the other hand, if the back driver attacks, the opponent can make a choice himself: either to defend (trying to keep the position but wearing his tyres more by battling) or to let the attacker overtake for free (if he knows he is much slower than the attacker). All the variables involved in the decisions could in principle be lap-dependent (for example, after one lap velocities will be different due to tyre wear, and therefore this evaluation may lead to a different outcome).
Here we take everything into account by properly parameterizing the payoff of each driver in each situation, so as to enter the game-theoretical framework. In doing so, we design a two-player dynamic game of complete but imperfect information: the back driver (Attacker, A), who plays first and can take actions "attack" (a) or "not to attack" (n), and the driver in front (Defender, D), who does not have to make a choice if A does not attack but chooses between "defend" (d) and "not to defend" (n) when A attacks. If the joint strategy of the two players is (a, d), then a battle arises: in this case we model the outcome of the attempted overtake as a "move by Nature", i.e. the decision of an external player (Nature, N) according to which the attempt can either be successful (✓), fail (✗) or result in a crash (❆). Here the information is complete because each player has full knowledge of the game status (i.e., everyone knows the full history, the sequence of actions, the payoffs, the probabilities of outcomes and knows that everybody knows everything) but imperfect because the presence of Nature forces the final choice to be made according to expectations, and not certainty.
The extensive form of our game is depicted in figure 1: the numerical values of the payoffs at the end of each path are fine-tuned like the other parameters of the implementation and can be changed to study different behaviours of players. However, we use the following rationale to define them: if an overtake happens with no duel ((a,n)), the payoffs will be x and −x for A and D respectively; if an overtake happens after a duel ((a,d)), the previous payoffs will be lowered by 0.5 to model the tyre wear caused by the fight; if A and D fight but there is no overtake, the payoffs will have to model only the tyre wear (−0.5 for both drivers); a crash has to be the worst scenario, in which both drivers obtain the lowest utility; if A does not attack, nothing happens, so the payoffs will both be equal to 0. An example of the payoffs obtained using this rationale is shown in Table 3. Of course, assuming players to be rational allows us to set the Nash Equilibrium (NE) of the game (which is also a Sub-game Perfect Equilibrium, SPE) as the realization of what happens in the in-race situation we are investigating (i.e., we make drivers follow the NE in the simulation). Starting from figure 1 we can derive the normal form of the game: this is reported in table 1.

                             D
                  n                         d
    A    a    u_A(a, n), u_D(a, n)     u_A(a, d), u_D(a, d)
         n    u_A(n, n), u_D(n, n)     u_A(n, d), u_D(n, d)

Table 1: Normal form of the overtake game. An example of sensible numerical values is given in table 3. Notice that u_A(a, d) and u_D(a, d) are expected utilities, while u_A(n, n) = u_A(n, d); the same holds for D (when A chooses n, the game ends).

As mentioned, N is a "hidden player" of the game, which is responsible for the final outcome but has no strategic interest in the result. Its practical role is to randomly decide the result of the duel. In order to do so, it is necessary to accurately model the probabilities linked to each outcome: these will depend on the drivers' parameters at the i-th lap. We define the successful overtake probability as:

\[ \alpha = \alpha\left(\theta_A^{(i)}, \theta_D^{(i)}\right) = \frac{v_A^{(i)} - v_D^{(i)}}{v_{\max}} + \frac{s_A^{(i)} - s_D^{(i)}}{s_{\max}} \tag{1} \]

and the crash probability as:

\[ \gamma = \gamma\left(\theta_A^{(i)}, \theta_D^{(i)}\right) = \frac{c_A^{(i)} \cdot c_D^{(i)}}{c_{\max}^2} - \frac{s_A^{(i)} \cdot s_D^{(i)}}{s_{\max}^2} \tag{2} \]

where θ_A^(i) is the set of A's parameters at the i-th lap
(v_A^(i) is A's velocity at the i-th lap, s_A^(i) its skill level and c_A^(i) its craziness), and the same applies to D. Namely, the probability that A overtakes D is large when A has higher velocity and/or skill than D, while the probability that A and D crash depends on the craziness and the ability of the two drivers. Note that the former changes at each lap, since the velocity of each driver depends on other factors of the race (such as tyre age), while the latter is determined at the start. In our implementation the parameters are fine-tuned so as to have α ∈ [10% : 90%] and γ ∈ [5% : 40%].

Figure 1: Overtake game for Attacker, Defender and Nature

The last tool we need is a way to find the NE. This can be done by backward induction: starting from the right-hand side of figure 1 (D moves second), we focus on the yellow node and take as D's choice the one which maximizes his payoff. Please notice that while the payoffs for the path (a,n) are well defined, the ones for the path (a,d) are to be taken as the expected payoffs over all possible choices of Nature, weighted by their probabilities, meaning:

\[ u_A(a, d) = (1 - \alpha - \gamma) \cdot u_A(a, d; ✗) + \alpha \cdot u_A(a, d; ✓) + \gamma \cdot u_A(a, d; ❆) \tag{3} \]

where for example u_A(a, d; ✓) is the payoff of A for the strategy (a,d) when the overtake is successful. Similar equations can be written for the Defender.
The choice of D cancels out one of the two paths on the bottom yellow node. At this point, we proceed with backward induction and make A take the action that maximizes his payoff. This way we find the SPE of the game. When equal payoffs arise, we assume each player to be generous (i.e., backward induction returns the path that maximizes the opponent's payoff).
The game and solution described above apply to an arbitrary pair of drivers between whom a battle for position is possible. In the following paragraphs we describe how we managed to create a fully working simulator of a Formula One race by employing this GT model in the realization of duels. Moreover, building a simulation has the benefit of naturally varying the parametric conditions of the game, allowing us to investigate the solutions of hundreds of games with the same extensive form but different expected payoffs.

3. Dataset

We run simulations for the 20 official drivers of the 2021 Formula One season, whom we describe by means of different parameters crawled from two online portals (Corriere dello Sport [3], SportSkeeda [4]). The parameters used in the model are:

• Maximum velocity [v]: car speed, in the range (4,10)
• Skill [s]: the skill of the driver, in the range (1,5)
• Craziness [c]: the driver's inclination to take risks, in the range (1,7)
• Tyre Age [ty]: the condition of the tyres (all drivers begin the race with ty = 100%)

Moreover, each driver is identified by a three-letter code and a starting position on the grid. The attributes were modeled by us based on data referring to the last F1 seasons: the "Velocity" and "Skill" parameters are set in relation to the constructors' ranking and the drivers' final rankings (reported in [3]). Drivers belonging to the same team drive the same car and thus have the same maximum speed. For
what concerns the values of the "Craziness" parameter, we set them in the range from 1 to 7 according to the ranking of damages (in thousands of dollars) each driver caused during the season (available in [4]).

4. Simulator model and implementation

From a starting grid specified by the user, our implementation performs an iterative procedure to update the status of the drivers at each lap, which takes into account several real-life details of a Grand Prix. It is composed of three fundamental modules:

• A GT-based system to determine the optimal choices of the drivers;
• An overtake actuator that performs actual overtakes on the basis of the drivers' choices;
• An inter-lap routine which updates all relevant parameters at each lap (e.g. velocity, tyre age, etc.).

We review them in more detail.

GT-based overtake module

This module simply implements what has been described in section 2: given a pair of drivers (A,D), we compute the values of α and γ for the specific game and apply backward induction to retrieve the optimal choices of the players.

Overtake system and actuator

In real-world F1 races, one typical situation that drivers face is the one in which a potentially faster car is "covered" (and so slowed down) by an opponent. Moreover, overtakes are possible only when attacker and defender are close enough, in a process which is enhanced by what is technically called the "DRS time window". For the sake of simplicity we could also state that in an overtake the time gain of the attacker is equal to the time loss of the defender, while it is well known that when two drivers battle they stress their tyres more than when they are running free.
Inspired by all this, we model the overtake procedure as follows: we set a time window Δ_tot which defines the maximum absolute time difference between drivers for an overtake to be possible (i.e., for each pair of drivers D_1, D_2 with T_2 > T_1, D_2 can attack D_1 if and only if T_2 − T_1 < Δ_tot) and we launch the GT module described above for each pair of drivers satisfying this constraint (with particular attention to the case in which more than two drivers lie inside the time window, see below).
On the basis of the outcome of the previous module, three scenarios can arise for the attacker-defender pair, namely the three paths of figure 1:

1. Equilibrium (n,n) (equivalent to (n,d)): the optimal choice for the attacker is not to attack, and therefore the game comes to an end. This is the case of a highly improbable overtake or a highly probable crash. Here the defender simply keeps his position and, while updating the total time of the drivers, we make sure that the attacker keeps a total time slightly above that of the defender. The attacker loses the possibility to perform other attacks in this lap.

2. Equilibrium (a,n): it is not convenient for the defender to defend, since either the attacker is highly favoured or the crash probability is high. The attacker therefore has the chance of a free overtake, which we model as a swap in the absolute times of attacker and defender: A now has the lower absolute time of the two and therefore a higher ranking. Performing the overtake costs A a certain tyre wear ty_wear^battle, while nothing happens to D since he did not fight. If there are other drivers in the attacking time window of A, he can continue to attack (the GT module is called again).

3. Equilibrium (a,d): both attacker and defender have good reasons to fight, since expected payoffs are convenient for both. The outcome is decided by Nature (via random variable generation), while both players degrade their tyres because of the fight. If the overtake is successful, we swap A's and D's total times and A can continue to attack; if it fails, we proceed as for the (n,n) scenario. Finally, if Nature decides for a crash, both players receive a (considerable) time penalty t_crash (we decided not to make them retire, so as not to change the total number of drivers).

Update to the next lap

After all possible overtakes have been considered and all players have made their choices on the basis of Game Theory techniques, we update the status quo of the race by considering the evolution of all the parameters subject to changes over time, namely:

• Increase all tyre ages: at each lap the degradation of the tyres is given by a fixed parameter ty_wear^std divided by the skill of each driver (i.e., for all players the fixed degradation per lap is a value in the range [1% : 5%]). This represents the huge role that drivers' abilities play in tyre management in actual races.
• Check tyre ages: we check which drivers have tyres that are too worn (below a threshold ty_life^min). If the tyres are too degraded, a pitstop is forced to collect new ones.
• Identify undercutters: it is not rare that in real situations a driver who is facing difficulties in overtaking an
opponent despite being faster decides to anticipate the pitstop to gain higher velocity and a free path, waiting for the opponent to have his pitstop later and eventually fall behind ("undercut overtake"). In our simulator, there could be situations in which it is convenient for a driver to attack but the probability of success is still small and Nature repeatedly denies the overtake. Therefore, we set a parameter f_max^uc for the maximum number of attacks a driver can fail before deciding to perform an undercut (and therefore to enter the pitstop).
• Perform pitstops: for players willing to pit we reset the tyre life to 100% at the price of a slight time penalty t_ps. Moreover, to add randomness to our implementation, we allow the mechanics to make a mistake with a small probability p_ps^err, in a way that penalizes the driver more (with a time t_ps^err > t_ps).

This correctly represents what happens in reality: rational drivers who are aware of their superior abilities decide to attack and gain positions, which confirms that our model works correctly.

5.1. Simulations

Figure 2 reports the evolution of a Grand Prix simulation with 300 laps (parameters reported in table 3). One can observe:

• Left side: drivers in their initial positions;
• Center: drivers' positions, represented by lines of different colors, which can change at each lap because of any overtake, crash or pitstop;
• Right side: drivers in their final positions.
and s_i, which are higher than the ones of other drivers. Figuratively, this means that "faster pilots run a race on their own", competing just among themselves: this is quite a common phenomenon in modern F1. Moreover, the top positions of the final ranking (on the right of Figure 2) are very similar to those in the starting grid: again, this strengthens the thesis expressed previously.
We run further analyses by keeping the same driver parameters but reversing their starting positions and placing the best drivers last ("descending" order, DESC). In this way we want to test the simulator in a situation of full disadvantage for the drivers with high v_i and s_i. Results are reported in figure 3, while a focus on the starting part of the race is displayed in figure 4.
Grand Prix goes to balance itself, with a behaviour similar to the one of figure 2. This makes us suspect the existence of a sort of "dynamical equilibrium" configuration (more on this in the following sections).

5.2. Parameter settings

Table 3 reports the parameter settings for the simulations above.

    Parameter    Value (a.u.)     Parameter         Value (a.u.)
    Δ_tot        2.0              t_crash           7.0
    L            10.0             t_ps              3.0
    p_sc         1%               t_ps^err          7.0
    v_i          [4.0 : 10.0]     p_ps^err          5%
    s_i          [1.0 : 5.0]      ty_life^min       30%
    c_i          [1.0 : 7.0]      ty_wear^std       5%/s_i
    α            [10% : 90%]      ty_wear^battle    2%
    γ            [2% : 40%]       f_max^uc          3

    Payoff                              Value (a.u.)
    u_A(n, n), u_D(n, n)                0.0; 0.0
    u_A(a, n), u_D(a, n)                3.0; −3.0
    u_A(a, d; ✗), u_D(a, d; ✗)          −0.5; −0.5
    u_A(a, d; ✓), u_D(a, d; ✓)          2.5; −3.5
    u_A(a, d; ❆), u_D(a, d; ❆)          −7.0; −7.0
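
As an illustration of how the payoffs of Table 3 feed the backward induction of section 2, the following is a minimal Python sketch, not the authors' implementation. It takes α and γ as given inputs, computes the expected duel payoffs as in equation (3), lets D choose first at his node and then A, and breaks ties generously in favour of the opponent. All function and constant names are hypothetical, and the check that α + γ ≤ 1 is an assumption added here so that the three outcome probabilities are well defined.

```python
# Payoff pairs (A, D) copied from Table 3. All names are illustrative only.
NO_ATTACK = (0.0, 0.0)       # u_A(n, n), u_D(n, n)
FREE_PASS = (3.0, -3.0)      # u_A(a, n), u_D(a, n)
DUEL_FAIL = (-0.5, -0.5)     # overtake attempted, no position change
DUEL_SUCCESS = (2.5, -3.5)   # overtake after a duel
DUEL_CRASH = (-7.0, -7.0)    # worst outcome for both drivers

A, D = 0, 1  # indices of the two players inside a payoff pair


def expected_duel_payoffs(alpha, gamma):
    """Expected (A, D) payoffs of the path (a, d), following equation (3)."""
    # Assumption of this sketch: reject inputs that are not valid probabilities.
    if alpha < 0 or gamma < 0 or alpha + gamma > 1:
        raise ValueError("need alpha >= 0, gamma >= 0 and alpha + gamma <= 1")
    p_fail = 1.0 - alpha - gamma
    return tuple(
        p_fail * fail + alpha * success + gamma * crash
        for fail, success, crash in zip(DUEL_FAIL, DUEL_SUCCESS, DUEL_CRASH)
    )


def best_option(options, me, other):
    """Pick the (action, payoffs) pair that maximizes the payoff of `me`.

    Ties are broken generously: among equally good options for `me`,
    the one with the higher payoff for `other` is returned.
    """
    return max(options, key=lambda option: (option[1][me], option[1][other]))


def solve_overtake_game(alpha, gamma):
    """Backward induction on the overtake game; returns ((a_action, d_action), payoffs)."""
    # Step 1: D's node, reached only if A attacks.
    duel = expected_duel_payoffs(alpha, gamma)
    d_action, after_attack = best_option([("d", duel), ("n", FREE_PASS)], me=D, other=A)
    # Step 2: A's node, knowing how D would respond to an attack.
    a_action, payoffs = best_option([("a", after_attack), ("n", NO_ATTACK)], me=A, other=D)
    # If A does not attack, d_action is only D's planned response:
    # (n, d) and (n, n) describe the same outcome, as noted in Table 1.
    return (a_action, d_action), payoffs


if __name__ == "__main__":
    for alpha, gamma in [(0.5, 0.1), (0.9, 0.05), (0.1, 0.1)]:
        profile, payoffs = solve_overtake_game(alpha, gamma)
        print(f"alpha={alpha}, gamma={gamma}: equilibrium {profile}, payoffs {payoffs}")
```

With these payoffs the three sample inputs reach the three equilibria discussed in section 4: (a, d) for α = 0.5, γ = 0.1; (a, n) for α = 0.9, γ = 0.05, where defending is worth less to D than the free pass; and no attack for α = 0.1, γ = 0.1, where D would defend and the expected duel payoff of A is negative.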
[Figure 5 plot: "Absolute time evolution of the race"; x-axis: Lap number; y-axis: Absolute time (a.u.); one line per driver: MAZ, MSC, GIO, LAT, RAI, RUS, TSU, STR, VET, OCO, ALO, GAS, RIC, SAI, LEC, NOR, PER, BOT, HAM, VER]
Figure 5: Total time of the race per driver (first 250 laps, ASC configuration)
Figure 6: CPU time by varying ranking order and number of drivers
[Figure 7 panels: (a) "Number of duels for 20 drivers"; (b) "Number of duels for 40 drivers"]
Figure 7: Number of duels for different starting conditions for 20 drivers (a) and 40 drivers (b), corresponding to the number of calls of the GT decision module. Continuous lines represent the cumulative amount, dotted lines the average per lap.
that they will always be within the overtake window and we will have to run the GT decision module for this pair at each lap. However, if the fastest drivers are at the back, it can happen that while they overtake all the slowest ones they create a gap between pairs of slow twins which is eventually impossible to close, due to the limited time recovery they can make. As a result, fewer pairs of twins battle at each lap and the overall number of operations grows more slowly.
In all cases we see that the number of operations per lap is high at the beginning (race assessment transient) and decreases to a steady number for a large enough number of iterations. This is in accordance with the observations above.

5.5. Computational analysis

Some final considerations can be made regarding the computational time of the simulator, where the machine we use is a MacBook Pro (Processor: Intel Core i5 dual-core 2.3 GHz; Memory: 8GB 2133 MHz LPDDR3). We report four simulations with 1000 laps, obtained by varying the number of drivers (20 or 40, as in the previous section) and the starting grid (ASC or DESC configuration). Results are represented in Figure 6. Analyzing the 20-driver case, it is easy to see that for a number of laps n ≫ 1 the trends display the same tendency, meaning that changing the initial positions of the drivers does not affect the computation times. Since we register a higher number of overtakes with the DESC configuration, this result tells us that the whole overtake module we implemented has no significant impact on the CPU time. In the 40-driver case the situation is different: the two lines have the same trend until around lap 800 and from this point on the computation times differ slightly. Generally speaking, the trends of the computational times mirror those of the complexity analysis of section 5.4 (figure 7).

6. Conclusions

In our work we managed to design and implement a fully working simulator of a Formula One Grand Prix based on game-theoretically optimal joint strategies for rational drivers. We successfully modeled several in-race decision variables and took into consideration real-life dynamics coming from the study of experimental races. The behaviour of the players follows the one we expect: skilled drivers are conscious of their abilities and reach the top of the race by choosing to attack, regardless of the initial conditions.
Therefore, our work proposes an innovative tool either to analyse race prospects for real-life applications or to deepen the study of game-theoretical models in an automated environment. In particular, further investigation could reinforce the model by including other race variables (e.g., weather conditions) and by combining the decision model we proposed with some pre-game strategy study (such as tyre types, pre-arranged pitstop schedule, ...). Other useful insights could be achieved by considering longer-term payoffs (e.g., championship points and zero-sum games), redefining the overtake and crash probabilities or extending the horizon of the game to the n-player case.

References

[1] Formula 1 strategy and Nash equilibrium. Accessed: 2021-11-22.
[2] F1 teams facing "game theory" strategy battle in Turkish GP. Accessed: 2021-11-22.
[3] F1: le classifiche piloti e costruttori dopo il Gp del Qatar. Accessed: 2021-11-21.
[4] Which F1 driver accrued highest crash damage costs in 2021 season? Accessed: 2021-11-22.
[5] The Game Theory of Formula 1: Winning the Monaco Grand Prix. Accessed: 2021-11-22.