
Université Paris-Saclay
Laboratoire de physique théorique et modèles statistiques d'Orsay (LPTMS)

Mean Field Game modeling of epidemic propagation

Author: Louis Brémaud, M2 ICFP theoretical physics, 2020 - 2021
Supervisor: Denis Ullmo
Supervising Teacher: Frédéric Restagno
Internship: April 05 - June 25, 2021
Abstract
We consider the propagation of epidemics for which individuals have control over some parameters, such as their
vaccination rate or their social interactions. We present general properties of discrete Mean Field Games, which
are used to describe the behavior of individuals. Then we focus on two models deriving from the standard
compartmental SIR model used in epidemiology and discuss their Mean Field Game version. The first one
is a SIR model with a vaccination rate. Individuals have to choose their vaccination rate depending on the
epidemic situation in order to minimize a certain cost. If individuals choose to be vaccinated too early, the
epidemic does not take off, but the cost will be high because too many people are uselessly vaccinated.
On the other hand, if they do not vaccinate themselves, the epidemic grows and the cost due to the risk of
infection will be high too. A Mean Field Nash equilibrium is formed at the population level, and we compute
this equilibrium numerically. The second one is a SIR model with a structure of social contacts: there are
settings in which individuals have contacts and can be contaminated. Individuals are grouped by age
and control their rate of contacts in each setting in order to minimize a certain cost. We implement the
Mean Field Game paradigm on this model and find the Nash equilibrium numerically. Furthermore, we
develop a genetic algorithm which provides an alternative route to the optimization process when the usual
approach, through the Bellman equation, is not practicable.
Table of contents
1 Introduction                                               3
2 Discrete Mean Field Games                                  4
  2.1 General framework of discrete Mean Field Games         4
  2.2 The Hamilton-Jacobi-Bellman equation                   5
  2.3 Mean field Nash equilibrium                            6
  2.4 Numerical methods to find the Nash equilibrium         6
3 SIR model with a vaccination campaign [Following [2]]      7
  3.1 Theoretical framework                                  7
  3.2 Numerical approach                                     9
4 Genetic algorithms                                        12
  4.1 Theoretical framework                                 12
  4.2 Application to the SIR model with vaccination         12
5 SIR model with a social structure                         15
  5.1 Theoretical framework                                 15
  5.2 Mean Field Game approach                              16
  5.3 Practical implementation                              17
  5.4 Numerical simulations                                 18
  5.5 Other method to reach the Nash equilibrium            21
6 Conclusion                                                23

1 Introduction
For many years, and in particular since the beginning of the COVID-19 pandemic, the question of modeling
epidemic propagation as precisely as possible has been central. The most famous model, used for decades, is
the compartmental SIR (susceptible - infected - recovered) model. Here, we make a homogeneity assumption
at the society level, that is, we say that all individuals have the same probability of contact with any other
individual. The model is described by the following set of equations:

$$\dot{S} = -\hat\beta S(t) I(t)$$
$$\dot{I} = (\hat\beta S(t) - \hat\gamma) I(t) \qquad (1.1)$$
$$\dot{R} = \hat\gamma I(t)$$

where S, I and R are respectively the proportions of susceptible, infected and recovered people in our society, β̂
is the transmission rate of the disease considered and γ̂ the recovery rate. The SIR model is very simple and
has many variations designed to gain in precision. The most common ones are the SIRD model (D for deceased, [11]),
SIRV (V for vaccination, [12]), MSIR (M for maternally derived immunity, [8], [9]), SIRC (C for carrier
but asymptomatic, [10]) and SEIR (E for exposed class, [4]) models, but there are many other variations;
see [8] and [9] for a more detailed literature on the subject of compartmental models.
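For concreteness, here is a minimal sketch of how (1.1) can be integrated numerically; the explicit Euler scheme and the time grid are our own illustrative choices, not the integrator used in the report, and the parameters are those of fig 2 below.

```python
import numpy as np

def sir(beta_hat, gamma_hat, S0, I0, T=50.0, n=5000):
    """Explicit Euler integration of the SIR system (1.1) (a minimal sketch)."""
    dt = T / n
    S, I, R = np.empty(n + 1), np.empty(n + 1), np.empty(n + 1)
    S[0], I[0], R[0] = S0, I0, 1.0 - S0 - I0
    for k in range(n):
        new_inf = beta_hat * S[k] * I[k] * dt   # susceptibles infected during dt
        new_rec = gamma_hat * I[k] * dt         # infected who recover during dt
        S[k + 1] = S[k] - new_inf
        I[k + 1] = I[k] + new_inf - new_rec
        R[k + 1] = R[k] + new_rec
    return S, I, R

# example with the parameters used later in the report (fig 2)
S, I, R = sir(beta_hat=0.5, gamma_hat=0.1, S0=0.98, I0=0.02)
```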

In many models, the two parameters β̂ and γ̂ are fixed, but it is well known that at least the transmission
rate β̂ changes a lot with time. Indeed, there is a fixed part of β̂ intrinsic to the disease and a changing part
which depends on human interactions (which evolve with time, see for example the different lockdown periods
in almost all countries in the world during the COVID-19 pandemic). Thus, we can assume a time dependence
β̂(t). The variations of β̂, and thus of human interactions (frequency and number of contacts), are unknown
a priori. This parameter is extrinsic to the model: we need some guess about it, but this guess has to be
time dependent, which makes it extremely difficult to extract β̂(t) from experimental data, as it involves how
human behavior at the society level will evolve in the future. As an example, the Institut Pasteur in France,
responsible for advising policy decisions during the COVID-19 epidemic crisis, makes many predictions with
several possible evolutions of the reproduction number R0 and the vaccination rate, because they do not
model the behavior of people (see their work in [13]). Instead, they take an empirical approach, as in [14],
to estimate the impact of restrictive measures on the transmission rate β̂ and use it afterwards in their predictions.

We propose here to take another approach, and to turn extrinsic functional parameters of the model into
intrinsic ones (e.g. β̂ in the simple SIR model, but there may be other parameters in variations of the SIR). That
is, we want to develop a theoretical apparatus providing a prediction for these functional parameters. With the
aim of modeling human behavior, we will use the "Mean Field Game" (MFG) paradigm. Introduced by Lasry and
Lions in [15], [16] and [17], and independently by M. Huang, R. P. Malhamé and P. E. Caines [18], MFG focus
on the derivation of a Nash equilibrium within a population containing a large number of individuals. The
classical way to write down the dynamics of these games is through a (coupled) system of Kolmogorov and
Bellman equations (see [19] or [1] for a complete mathematical description). They are also used by theoretical
physicists, for example in the case of crowd dynamics (see [20] for an introduction to continuous games
from a physicist's point of view).

Our final goal is to develop refined SIR models with a description of the society at a mesoscopic level,
that is, a level where there are heterogeneous groups but which is still far from the individual description (so as
to make the homogeneity assumption at this level rather than at the society level). Then, the idea is to obtain the
functional parameters by a MFG approach. In order to cope with the increased complexity of our models, we will need
generic methods to solve our Mean Field Games, because the Bellman equations associated with the refined models
might be very complicated to solve and even to derive. This is why we will develop genetic algorithms, which
can find the solution of the Mean Field Game numerically without resorting to the Bellman equation.

The question of control of epidemics is often modeled through a combination of isolation and vaccination
strategies. That is, the number of contacts of individuals and the vaccination are the main levers to impact
the epidemic evolution. Some work has already been done from a global point of view, for example [21] and [22],
but they do not explore individual behavior. Some other papers try to evaluate the best individual
response without the Mean Field Game paradigm ([23], [24]).

In our work, we keep the two main approaches to control epidemics, that is isolation and vaccination
strategies, and we study them within MFG. This report is organized as follows: after an introduction to
discrete Mean Field Games which relies on the paper [1], we study a first paper of Turinici [2] which uses
Mean Field Games in order to determine a vaccination campaign. We present the main ideas and the physical
interpretations of the results of [2], and we implement the numerical part of the paper (which is mainly focused
on mathematical properties). Then, we introduce genetic algorithms with a description based mainly on the
book on evolutionary computing [3], and we apply them in order to recover the results of the paper [2] without the
Bellman derivation. In a fourth part, we introduce a SIR model with a structure of social contacts proposed
by [4] and [25] to get a more detailed description of the society, at a mesoscopic scale. We include the Mean
Field Game paradigm in this scheme following the spirit of the paper of Turinici [5], who did it for a classical
SIR model. We run numerical simulations in order to understand the behavior of our model and we find
the Nash equilibrium with two different methods (inspired by [6] and [7]). Finally, we conclude our work by
proposing some possible enhancements to our model and some openings to other models.

2 Discrete Mean Field Games

In this section, we provide a general description of discrete Mean Field Games, based mainly on the paper
[1]. The reader can refer to this paper for a rigorous mathematical approach. In the following, the symbol ≡ in an
equation means "by definition" of the left or right hand side.

2.1 General framework of discrete Mean Field Games

Mean Field Games (MFG) are part of game theory in mathematics. We consider a game played during a certain
(continuous) time T with a large number N of players (or agents). At a certain time t, each player is completely
described by his state, which can be continuous (for instance in the case of crowd dynamics, with his
position $\vec{x}$, see [20]) or discrete (e.g. in the case of a limited number of choices for the players, i). We will deal
only with discrete games in the following. We have a finite number of states a = 1, ..., d and we denote by
$\theta^a(t) = k_a/N$ the proportion of players in the state a at time t (there are $k_a$ players in state a here), with $\sum_a \theta^a(t) = 1$.
In our SIR model above, the N players are the individuals of our population, T the characteristic time of
the end of the epidemic, and the discrete states are the different compartments: susceptible, infected and
recovered. The associated proportions are respectively S, I and R (with S + I + R = 1). In the following, we
will also denote by S, I and R the different compartments if there is no ambiguity. The players have control
over some parameters α(t) in order to modify their own state, but only at a probabilistic level. Indeed, α
takes the form of a transition rate between states (Markov process). In the example above, the control parameter
could be the transmission rate β̂(t) (as is done in [5]): susceptible individuals (at t) then control
their probability β̂(t)·I(t)·dt of being infected at t+dt, but do not have direct control over their next state.

We consider that all players are equivalent (symmetry assumption); therefore the proportion of players
in each state $\theta^i$ is sufficient to describe the game and we can consider a reference player to study it. But this
problem is still very hard to solve because we have N players in interaction (many-body game theory). In
order to turn the problem into a solvable one, we can make the following mean field approximation: since
the number of players is very large, we consider that players are only sensitive to $\bar\theta^i(t) = \langle \theta^i(t) \rangle$ (stochastic
average), that is, we consider that N is sufficiently large to neglect fluctuations around $\bar\theta^i(t)$. Now we can deal
with $\bar\theta^i(t)$, which is a deterministic quantity. Actually, this mean field approximation was already made in the
SIR model (1.1); we indeed wrote S, I and R as deterministic quantities, with a well known variation at
each step. In reality, there are stochastic fluctuations of order $\sqrt{N}$ for these quantities, but we wrote directly $\bar S$, $\bar I$, $\bar R$.

Since all players are equivalent, we denote by β(t) the strategy of all the other players (we call strategy a
certain choice of control parameters during the game); β(t) is a matrix of transition rates. We stress that β is
a strategy and not the transmission rate β̂ of the SIR model. The problem is said to be consistent (this will
be referred to below as a "Nash equilibrium") if the best strategy α* for the reference player is β: α*(β) = β,
because by symmetry this strategy β will be the best for each player. This is an equilibrium point in the
sense that nobody has an interest in changing his strategy if everybody else keeps the strategy β (which is
the general definition of a (symmetric) Nash equilibrium).

Since the problem is described by a Markov process with a continuous time, we naturally obtain the
dynamical equation of $\theta^i$, also known as the Kolmogorov equation (in reference to continuous games, where we
deal with the original Kolmogorov equation). Writing $\theta^i(t)$ instead of $\bar\theta^i(t)$ for simplicity,

$$\frac{d\theta^i(t)}{dt} = \sum_j \theta^j(t)\,\beta_{ji}(t) \,, \qquad (2.1)$$

with θ(0) = θ0. Here $\beta_{ij} \geq 0$ is the transition rate from i to j and θ is the vector of all the $\theta^i$. Note that for the SIR
model, (2.1) corresponds to (1.1) and we can directly read off the rates $\beta_{ij}(t)$ in this case (it could
be that only some rates $\beta_{ij}(t)$ are controlled by players). Remark: by convention, $\beta_{ii}(t) \equiv -\sum_{j \neq i} \beta_{ij}(t)$
(that is, the rate to stay in i is the opposite of the total rate to leave i).
During the game, each player pays a certain instantaneous cost c which depends on his own state but also
on the state of all the other players. For a given player, the total cost incurred is the integral over time of his
instantaneous cost (plus possibly a final cost) for a certain strategy α. Denoting by $i_t$ the state of our
reference player at t and assuming that he knows θ(s) for all s ≥ t, the total cost paid between t and T
has the following general form, starting with $i_t = i$ and with θ given (this is also the case in the following
equations):

$$u^i_\theta(t, \alpha) = \mathbb{E}^\alpha_{i_t = i}\left[ \int_t^T c(i_s, \theta(s), \alpha(s))\,ds + \Psi^{i_T}(\theta(T)) \right] . \qquad (2.2)$$
t

The expectation here is over $i_s$, which is the only undetermined quantity (the superscript α means
"expectation for a given strategy α"). The goal of each player is to minimize his own total cost. In the
SIR model, a strategy is what is chosen for β̂(t) at each t. We can imagine a cost where we pay a fixed price
$r_I$ at t if we are infected by the disease at t, and a continuous price at each time t if we choose to reduce
our social contacts β̂(t). Thus, the cost depends on the risk of infection and therefore on the situation of all the
other individuals (how many infected are there?). Naturally, all individuals want to minimize their total cost
during the epidemic. To minimize the function $u^i_\theta(t, \alpha)$, each player chooses the best strategy α* to obtain
the value function, the minimum price (on average) paid by a player between t and the end of the game T:

$$u^i_\theta(t) = \min_\alpha u^i_\theta(t, \alpha) = u^i_\theta(t, \alpha^*) \,. \qquad (2.3)$$

The goal of the problem is to find α*(t) (i.e. the best strategy of the players) and θ(t) (the evolution of the
proportion of players in each state).

2.2 The Hamilton-Jacobi-Bellman equation

In order to get the best strategy α*, we derive "physically" the equation followed by the value function
$u^i_\theta(t)$ (see [1] for a rigorous derivation). By definition of the Markov process, we have (at first order as dt → 0):

$$P(i_{t+dt} = j \,|\, i_t = i) = \alpha_j(t) \cdot dt \quad (j \neq i) \qquad (2.4)$$

$$P(i_{t+dt} = i \,|\, i_t = i) = 1 - \sum_{j \neq i} \alpha_j(t) \cdot dt \,. \qquad (2.5)$$

We use an argument "à la Bellman", writing $u^i_\theta(t)$ in terms of $u^i_\theta(t+dt)$ (see (2.2) and (2.3) for the definitions):

$$u^i_\theta(t) = \min_{\alpha(t) \in \mathbb{R}^d} \mathbb{E}_{i_t = i}\left[ u^{i_{t+dt}}_\theta(t + dt) + c(i, \theta(t), \alpha(t)) \cdot dt \right] . \qquad (2.6)$$

Then, if we write E explicitly with the probabilities of the Markov process:

$$u^i_\theta(t) = \min_{\alpha(t) \in \mathbb{R}^d} \left[ \sum_{j \neq i} \alpha_j(t) \cdot dt \cdot u^j_\theta(t + dt) + \Big(1 - \sum_{j \neq i} \alpha_j(t) \cdot dt\Big) \cdot u^i_\theta(t + dt) + c(i, \theta(t), \alpha(t)) \cdot dt \right] , \qquad (2.7)$$

where the first term corresponds to $i_{t+dt} = j$ for some j and the second term to $i_{t+dt} = i$. Then, since
$u^i_\theta(t + dt)$ is independent of α(t), this gives the Hamilton-Jacobi-Bellman equation:

$$-\frac{du^i_\theta}{dt} = \min_{\alpha(t) \in \mathbb{R}^d} \left[ c(i, \theta(t), \alpha(t)) + \alpha(t) \cdot \Delta_i u_\theta(t) \right] , \qquad (2.8)$$

where we define the difference operator on i as $\Delta_i u \equiv (u^1 - u^i, ..., u^j - u^i, ..., u^d - u^i)$ with $u \in \mathbb{R}^d$.

Moreover,

$$\alpha^* \equiv \operatorname*{argmin}_{\alpha(t) \in \mathbb{R}^d} \left[ c(i, \theta, \alpha(t)) + \alpha(t) \cdot \Delta_i u_\theta(t) \right] \qquad (2.9)$$

is the optimal Markovian control. Remark: below, the Hamilton-Jacobi-Bellman equation (which can be seen as a
discrete version of a Hamilton-Jacobi equation) is sometimes called the Bellman equation for simplicity.
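To make the backward structure of (2.8)-(2.9) concrete, here is a minimal sketch of how this step could be solved numerically for a given, fixed evolution θ(t) and a user-supplied instantaneous cost. The function names, the explicit-Euler backward stepping and the use of a general-purpose minimizer (restricted here to nonnegative rates) are our own illustrative choices, not the implementation of the report.

```python
import numpy as np
from scipy.optimize import minimize

def solve_bellman(theta, cost, psi_T, d, T, nt):
    """Backward integration of the discrete HJB equation (2.8) for a given
    population evolution theta (array of shape (nt, d)); cost(i, theta_t, a)
    is the instantaneous cost; psi_T is the final cost vector of length d.
    Returns the value function u and the optimal control alpha* of (2.9)."""
    dt = T / (nt - 1)
    u = np.zeros((nt, d))
    alpha_star = np.zeros((nt, d, d))       # alpha_star[n, i, j]: rate i -> j
    u[-1] = psi_T
    for n in range(nt - 2, -1, -1):
        for i in range(d):
            delta = u[n + 1] - u[n + 1, i]  # difference operator Delta_i u
            def rhs(a):                      # c(i, theta, a) + a . Delta_i u
                return cost(i, theta[n], a) + a @ delta
            res = minimize(rhs, x0=np.ones(d), bounds=[(0, None)] * d)
            alpha_star[n, i] = res.x
            u[n, i] = u[n + 1, i] + dt * rhs(res.x)   # explicit Euler, backwards
    return u, alpha_star
```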

2.3 Mean field Nash equilibrium

We obtain a Nash equilibrium when the best strategy of the reference player is β itself, that is, at each
time s, for all j, i,

$$\beta_{ji}(s) = \alpha_i^*(\Delta_j u(s), \theta(s), j) \,. \qquad (2.10)$$

The Nash equilibrium is therefore characterized by the system of Kolmogorov and Hamilton-Jacobi-Bellman
equations (replacing $\beta_{ji}$ by (2.10)):

$$\frac{d\theta^i}{dt} = \sum_j \theta^j \, \alpha_i^*(\Delta_j u, \theta, j)$$
$$-\frac{du^i}{dt} = \min_{\alpha(t) \in \mathbb{R}^d} \left[ c(i, \theta(t), \alpha(t)) + \alpha(t) \cdot \Delta_i u_\theta(t) \right] , \qquad (2.11)$$

together with the initial-terminal conditions

$$\theta(0) = \theta_0 \,, \qquad u^i(T) = \Psi^i_T(\theta(T)) \,. \qquad (2.12)$$

This forms the initial-terminal value problem (ITVP) for the Mean Field Game. The solution of this problem
is the solution of the MFG. Remark: in general, we have some final cost $\Psi^i_T(\theta(T))$, but in the case of epidemic
propagation we will take $\Psi^i_T(\theta(T)) = 0$ in the following. Under some monotonicity assumptions on the cost, the
authors of [1] show that there exists a unique solution (θ, u) to ((2.11) - (2.12)), that is, there is a unique
Nash equilibrium.

2.4 Numerical methods to find the Nash equilibrium

We describe here two numerical methods we have implemented to find the Nash equilibrium.

1st method: Inductive sequence

The Nash equilibrium corresponds to a fixed point of the function $f_{\theta_0}: \beta \longrightarrow \alpha^*$ (β the global strategy, α* the best
individual strategy, and $\theta_0$ the initial conditions). The idea is to use an inductive sequence $U_{n+1} = f_{\theta_0}(U_n)$
to reach this point.

To compute $f_{\theta_0}(U_n)$, we apply the following general iterative scheme, with $U_n = \beta_n$ a global strategy
(and our inductive sequence):

$$\beta_n \xrightarrow{\text{Kolmogorov}} \theta(\beta_n) \xrightarrow{\text{Bellman}} \tilde f_\theta(\beta_n) = \alpha_n^* \xrightarrow{\text{Symmetric}} \beta_{n+1} = \alpha_n^* = f_{\theta_0}(\beta_n) \,. \qquad (2.13)$$

In the first step, we start from a global strategy $\beta_n$ and we compute numerically $\theta(\beta_n)$ using (2.1). Then, we
solve the Bellman equation (2.8) numerically or analytically, backwards in time, to find $\alpha_n^*$, the best individual
strategy in response to $\beta_n$ (this corresponds to $\tilde f$). Finally, by the symmetry assumption on players, each player
chooses the same best strategy, therefore the new global strategy is simply $\beta_{n+1} = \alpha_n^*$. We continue this
scheme until the convergence of $(\beta_n)$. If this sequence converges, we indeed reach a fixed point: $f_{\theta_0}(\beta) = \beta$.

The Picard-Banach fixed point theorem provides us with sufficient conditions for convergence: it states that
every contractive mapping on a complete metric space (with metric denoted d) has a unique fixed point (a
mapping f is said to be contractive on E if there exists 0 < k < 1 such that ∀(x, y) ∈ E, d(f(x), f(y)) ≤
k · d(x, y)). The theorem also gives that every inductive sequence of the above form converges to the fixed
point geometrically: $d(U_n, l) \leq k^n \cdot d(U_0, l)$ (see [6]). The advantages of this method are that the convergence
is fast (geometric) and certain (we are sure to reach the Nash equilibrium). But in order to use it, we have to
be able to compute $f_{\theta_0}(\beta)$ analytically or numerically. Furthermore, if $f_{\theta_0}$ is not contractive, $(U_n)$ will not
always converge.

If $(\beta_n)$ does not converge, one possibility is to use instead $\beta_{n+1} = a \cdot f_{\theta_0}(\beta_n) + (1 - a) \cdot \beta_n$ with a ∈ [0, 1],
which is a more robust sequence, although convergence cannot be guaranteed here either, even with small values of a.
In fact such a sequence converges whenever the original sequence $(\beta_n)$ does, and works better for f decreasing
with a large negative slope, because we can avoid oscillations of the original sequence.
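A minimal sketch of this damped fixed-point iteration is given below; the function and parameter names are ours, and `f` stands for one full Kolmogorov + Bellman sweep ($f_{\theta_0}$), supplied by the user.

```python
import numpy as np

def damped_fixed_point(f, beta0, a=0.1, tol=1e-8, max_iter=500):
    """Damped fixed-point iteration beta_{n+1} = a*f(beta_n) + (1-a)*beta_n.
    f maps a global strategy (array of transition rates on the time grid) to
    the best individual response; the returned array approximates the Nash
    equilibrium strategy when the iteration converges."""
    beta = np.asarray(beta0, dtype=float)
    for n in range(max_iter):
        beta_new = a * f(beta) + (1 - a) * beta
        if np.max(np.abs(beta_new - beta)) < tol:   # sup-norm convergence test
            return beta_new
        beta = beta_new
    return beta   # not converged within max_iter; caller may inspect or restart
```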

2nd method: Gradient descent

The second method is to use a kind of gradient descent on the first variable of the cost (i.e. the individual
strategy). This is an equilibrium flow descent with an explicit Euler discretization (explained in [5] and
defined rigorously in [7]). Starting from a certain strategy $\alpha_0$, we have:

$$\alpha_{n+1} = \alpha_n - h \cdot \nabla_1 u(\alpha_{n+1}, \alpha_n) \,. \qquad (2.14)$$

The first variable of the cost u is the individual strategy (here $\alpha_{n+1}$) and the second variable is the
global strategy $\beta = \alpha_n$; h is the step of the gradient descent. If this sequence converges to a certain α,
then β = α (we can say that it is a Nash "candidate" because the individual and global strategies are the same)
and $\nabla_1 u(\alpha, \alpha) = 0$: we have reached at least a local minimum. The authors of [5] use this method
successfully to find the Nash equilibrium of their model. The issue with this method is that we are not certain
to reach a Nash equilibrium, because a Nash equilibrium corresponds to a "Nash candidate" which is a global
minimum of u with respect to the first variable α (that is the translation of "the best individual strategy in
response to the global strategy β = α is α itself").

3 SIR model with a vaccination campaign [Following [2]]

3.1 Theoretical framework

We present here a first application of Mean Field Games to epidemic propagation, made by Turinici in [2],
with 4 states: the Susceptible - Infected - Recovered - Vaccinated model. We add a global vaccination rate
u(t) to the SIR model which was presented in section 1:

$$\dot{S} = -\hat\beta S(t) I(t) - u(t)$$
$$\dot{I} = (\hat\beta S(t) - \hat\gamma) I(t)$$
$$\dot{R} = \hat\gamma \cdot I(t) \qquad (3.1)$$
$$\dot{V} = u(t)$$

with the initial conditions S(0) = S0, I(0) = I0, R(0) = 0, V(0) = 0. S, I, R, V are respectively the
proportions of susceptible, infected, recovered and vaccinated people. We define $\psi_I$ as the
probability that an individual who is not vaccinated is infected before t, which thus fulfills

$$\dot\psi_I = (1 - \psi_I(t))\,\hat\beta I(t) \,. \qquad (3.2)$$

β̂ is the transmission rate of the disease and γ̂ the recovery rate as in (1.1); both are taken fixed here.
We assume u(t) ≤ u_max, which is the maximal vaccination rate of the population (because of a maximal
vaccination capacity); u is such that u(t) = 0 if S(t) = 0. In [2], the authors derive some mathematical
properties of such a model (like the existence and uniqueness of a Nash equilibrium). We propose here to
recap their main results and add some physical interpretations.

The authors' idea is to allow people to choose their own vaccination rate λ(t); this is the control parameter.
Then, u(t) is simply the sum of the individual choices, u(t) = S(t)·λ(t) (only susceptible individuals can be vaccinated).
Thus, this model corresponds to a discrete Mean Field Game (where S, I, R and V are the possible states).
We define $\psi_v$ as the probability that an individual who is not infected is vaccinated before t,

$$\dot\psi_v = (1 - \psi_v)\,\lambda(t) \,. \qquad (3.3)$$

We suppose that all individuals are equivalent, in the sense that they all seek to minimize the individual
cost function.

One way to build this cost is the following: between t and t + dt, we have a probability $(1 - \psi_I(t))(1 - \psi_v(t))\,\hat\beta I(t)\,dt$
of being infected (we must be susceptible at t), and in that case we pay the price of infection
$r_I e^{-Dt}$. On the other hand, we have a probability $(1 - \psi_I(t))(1 - \psi_v(t))\,\lambda(t)\,dt$ of being vaccinated (we also have
to be susceptible at t) and we then pay $r_v e^{-Dt}$. The factor $e^{-Dt}$ is included in each cost because one prefers
to be infected (or to suffer from undesirable effects) late rather than immediately. D is called the discount factor
and is supposed to be constant. $r_I$ is the cost of the infection (quite intuitive) and $r_v$ is the cost of the
vaccination (in money or undesirable effects); both are supposed to be constant (they can evolve in reality).
Thus we obtain a total cost function J:

$$J \equiv \int_0^\infty (1 - \psi_I(t))(1 - \psi_v(t)) \cdot \left( \hat\beta I(t)\, r_I e^{-Dt} + \lambda(t)\, r_v e^{-Dt} \right) dt \,. \qquad (3.4)$$

We can check that we recover the formula of the paper [2] by identifying $\dot\Phi_I$ in the last formula and integrating
by parts:

$$J(\psi_v, u) = r_I \Phi_I(\infty) + \int_0^\infty \left[ r_I (\Phi_I(t) - \Phi_I(\infty)) + r_v e^{-Dt} (1 - \psi_I(t)) \right] d\psi_v(t) \,, \qquad (3.5)$$

where $\Phi_I(t)$ is defined as

$$\Phi_I(t) \equiv \int_0^t e^{-Ds}\, \dot\psi_I(s)\, ds = \int_0^t e^{-Ds} (1 - \psi_I(s))\, \hat\beta I(s)\, ds \,, \qquad (3.6)$$

which corresponds to the probability of being infected before t, weighted by the discount factor.

This form of the cost can be interpreted by considering each term separately. First, without vaccination,
we can say that an individual pays a price $r_I e^{-Dt}$ when he is infected. But the time of infection is uncertain,
so we take the mean (over the time of infection) of the price paid to construct the cost function:

$$J_I = r_I \cdot \int_0^\infty e^{-Ds}\, \dot\psi_I(s)\, ds = r_I \Phi_I(\infty) \,, \qquad (3.7)$$

where the last factor $\dot\psi_I(s)\,ds$ corresponds to the probability of being infected between s and s + ds. Considering
now the impact of vaccination, the positive effect is that after a vaccination at time t, one will pay only
$r_I \Phi_I(t)$ instead of $r_I \Phi_I(\infty)$, because the probability of infection vanishes after t. Taking the average over the
time of vaccination, this gives a positive impact (so a negative contribution to the cost):

$$J_{VI} = \int_0^\infty \left( r_I \Phi_I(t) - r_I \Phi_I(\infty) \right) d\psi_v(t) < 0 \,, \qquad (3.8)$$

where $d\psi_v(t)$ is indeed the probability of vaccination at time t for people who were not infected before t
(because the cost is only reduced for these people).
Finally, we add the cost of vaccination (undesirable effects). One pays a price $r_v e^{-Dt}$ for a vaccination at t,
and the probability of vaccination at t is here $(1 - \psi_I(t)) \cdot (1 - \psi_v(t)) \cdot \lambda(t) \cdot dt = (1 - \psi_I(t)) \cdot d\psi_v(t)$, because
people who have already been infected will never be vaccinated. Thus the cost of the vaccination is:

$$J_V = \int_0^\infty r_v e^{-Dt} (1 - \psi_I(t))\, d\psi_v(t) \,, \qquad (3.9)$$

and we obtain indeed $J \equiv J_I + J_{VI} + J_V$.
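As an illustration, the cost (3.4) can be evaluated numerically for any given epidemic curve I(t) and individual strategy λ(t); the following sketch (explicit Euler for (3.2)-(3.3) and a trapezoidal rule for the integral, our own illustrative choices) shows one possible way to do it.

```python
import numpy as np

def total_cost(t, I, lam, beta_hat, r_I, r_v, D):
    """Numerical evaluation of the individual cost J of eq. (3.4) for an
    epidemic curve I(t) and an individual vaccination strategy lambda(t),
    both sampled on the time grid t (arrays of equal length)."""
    dt = np.diff(t, prepend=t[0])
    psi_I = np.zeros_like(t)
    psi_v = np.zeros_like(t)
    for n in range(1, len(t)):                          # integrate (3.2) and (3.3)
        psi_I[n] = psi_I[n-1] + dt[n] * (1 - psi_I[n-1]) * beta_hat * I[n-1]
        psi_v[n] = psi_v[n-1] + dt[n] * (1 - psi_v[n-1]) * lam[n-1]
    integrand = (1 - psi_I) * (1 - psi_v) * (beta_hat * I * r_I + lam * r_v) * np.exp(-D * t)
    return np.trapz(integrand, t)                        # approximation of (3.4)
```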

Let us denote by g(t) the integrand of (3.5):

$$g(t) = r_I (\Phi_I(t) - \Phi_I(\infty)) + r_v e^{-Dt} (1 - \psi_I(t)) \,, \qquad (3.10)$$

with its derivative:

$$\dot g(t) = e^{-Dt} (1 - \psi_I(t)) \cdot \left[ (r_I - r_v)\,\hat\beta I(t) - r_v D \right] . \qquad (3.11)$$

Considering the beginning of the optimization at t, the minimization of the cost function J corresponds to the
minimization of V:

$$V(t) \equiv \inf_\lambda \left[ \int_t^\infty g(\tau)\, d\psi_v(\tau) \right] . \qquad (3.12)$$

Physically, this means that in order to minimize V, people will wait for g(t) to become negative to get vaccinated
(λ(t) > 0). This is natural because g(t) ≤ 0 corresponds to the fact that the vaccination represents a gain
(in statistical average) for people who are still susceptible. We can briefly analyse the behavior of g: since
I(0) is small (beginning of the epidemic), $\dot g(0) < 0$, and g reaches an extremum when $I(t) = \frac{r_v D}{(r_I - r_v)\,\hat\beta} \equiv I_{thr}$
(above this infection threshold there is a vaccination campaign). If $I_{max} \geq I_{thr}$ ($I_{max}$ is the maximum of I
during the epidemic), g decreases until $I(t) = I_{thr}$ (I increases at the beginning of the epidemic), increases
until $I(t) = I_{thr}$ is reached for the second time (passing through $I(t) = I_{max}$ at the inflection point of g), and finally
decreases (end of the epidemic, I(t) → 0). We can remark that in all cases, g(t) → 0 as t → ∞. If $I_{max} \leq I_{thr}$,
g decreases and is always positive: there is no vaccination campaign. (NB: we consider only the relevant case where
g(0) ≥ 0.)
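For concreteness, this threshold is trivial to evaluate; the small sketch below is ours, but the two numerical values it prints are consistent with the parameters quoted for figures 1 and 4.

```python
def infection_threshold(r_I, r_v, D, beta_hat):
    """I_thr of eq. (3.11): the infection level at which the derivative of the
    vaccination cost g changes sign; a vaccination campaign can appear only if
    the epidemic peak I_max exceeds this threshold."""
    return r_v * D / ((r_I - r_v) * beta_hat)

print(infection_threshold(0.4, 0.2, 0.1, 0.5))  # 0.2 (parameters of figure 1)
print(infection_threshold(0.4, 0.2, 0.1, 0.2))  # 0.5 (parameters of figure 4, no campaign)
```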

We now show that V obeys the following Hamilton-Jacobi-Bellman (HJB) equation, while S(t) > 0:

$$-\dot V(t) + \min_{\lambda(t) \in [0,\, u_{max}/S(t)]} \left[ (g(t) - V(t)) \cdot \lambda(t) \right] = 0 \,. \qquad (3.13)$$

We propose an argument "à la Bellman" to derive the above formula (the reasoning is essentially the same as
in section 2.2). If we denote by $i_t$ the state of an individual i at t, then (3.12) corresponds to the value function
of a Mean Field Game described by two states: the susceptible (who can therefore be vaccinated) and the
vaccinated. Indeed, if g(t) is the cost of the vaccination and λ our control parameter, then the argument of the inf in
(3.12) is the total cost of such a game with two states. The cost function C(t) here depends on the state at
time t + dt. If we are susceptible at t and vaccinated at t + dt, then we pay g(t) at time t. If we are still
susceptible at t + dt, we pay nothing at t (nothing happened). We write $i_t = S$ if i is susceptible at t and $i_t = V$ if i is
vaccinated at t.

We have a Markov process, so $P(i_{t+dt} = V \,|\, i_t = S) = \lambda(t)\,dt$ and $P(i_{t+dt} = S \,|\, i_t = S) = 1 - \lambda(t)\,dt$.
Furthermore, if i is vaccinated at t + dt, then the value function V(t + dt) = 0, since $\lambda(t') = 0$ for $t' \geq t$
(the value function is unchanged if we are still susceptible, because nothing happened in the interval). Thus,
considering only the relevant case where V(t) ≠ 0 (that is, $i_t = S$), we obtain

$$V(t) = \min_{\lambda(t)} \mathbb{E}_{i_t = S} \left[ V_{i_{t+dt}} + C(t, i_{t+dt}) \right] , \qquad (3.14)$$

which gives, writing E explicitly and using the expression for the Markov process,

$$V(t) = \min_{\lambda(t)} \left[ (1 - \lambda(t)\,dt) \cdot V(t + dt) + \lambda(t)\,dt \cdot g(t) \right] . \qquad (3.15)$$

At first order, $\lambda(t) \cdot dt \cdot V(t + dt) = \lambda(t) \cdot dt \cdot V(t)$, and since V(t + dt) is independent of λ(t),

$$-\frac{dV}{dt} = \min_{\lambda(t)} \left[ (g(t) - V(t)) \cdot \lambda(t) \right] . \qquad (3.16)$$

An equilibrium is realized when the global vaccination strategy u(t) matches the individual choices λ(t).
Since all individuals are equivalent, this requires u(t) = S(t)λ(t). It is a consistency equation expressing the
Nash equilibrium. Indeed, if the best individual strategy in response to u is a λ such that u = S · λ, we are at
equilibrium.

The authors of [2] show, under monotonicity assumptions, that (3.13) has a unique solution V. Then,
they try a solution of the form $\lambda(t) = u_{max}/S(t)$ for $t \in [t_1^*, t_2^*]$ (and λ(t) = 0 otherwise), where $t_1^* < t_2^*$ are such that
$g(t_1^*) = V(t_1^*)$ and $g(t_2^*) = V(t_2^*) = 0$. This is a particular solution of (3.13) and thus the solution, because of
the uniqueness. There is no analytical expression for the times $(t_1^*, t_2^*)$; they must be estimated numerically.

They finally find a Nash equilibrium solution of the form $u(t) = u_{max}$ when $t \in [t_1^*, t_2^*]$ and u(t) = 0
otherwise. Thus, all the dynamics depends on $(t_1^*, t_2^*)$; we can therefore reduce our search space of
strategies (individual and global) to a two-dimensional space to find the Nash equilibrium easily.
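One possible way to exploit this two-dimensional reduction is a brute-force scan over candidate switching times; the sketch below is ours, and `consistency_gap` is a hypothetical user-supplied function measuring how far the best individual response to a candidate bang-bang campaign is from the campaign itself (zero at the Nash equilibrium).

```python
import numpy as np

def find_nash_times(consistency_gap, T, n_grid=60):
    """Scan over the two switching times (t1*, t2*) that fully parametrize the
    bang-bang campaign u(t) = u_max on [t1, t2] and 0 otherwise, returning the
    pair that minimizes the supplied equilibrium-consistency gap."""
    grid = np.linspace(0.0, T, n_grid)
    best, best_gap = (0.0, 0.0), np.inf
    for i, t1 in enumerate(grid):
        for t2 in grid[i:]:                       # enforce t1 <= t2
            gap = consistency_gap(t1, t2)
            if gap < best_gap:
                best, best_gap = (t1, t2), gap
    return best
```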

3.2 Numerical approach

We propose to illustrate (and check) numerically the results of the paper [2]. We use the inductive sequence
defined in section 2.4, with a = 0.1, to find the Nash equilibrium. That is, we implement the scheme (2.13)
with $u_n(t)$ the sequence of global vaccination rates:

$$u_n(t) \xrightarrow{\text{Kolmogorov}} I(u_n) \xrightarrow{\text{Bellman}} f_{I(u_n)}(u_n) = \lambda_n \xrightarrow{\text{Symmetric}} u_{n+1}(t) = a \cdot S(t) \cdot \lambda_n(t) + (1 - a) \cdot u_n(t) \,. \qquad (3.17)$$

Once we reach the convergence of $u_n$, we can plot the time evolution (free scale, we will say "weeks" in
the following for concreteness) of all the compartments of our system.

Figure 1 – Nash equilibrium of the SIR model with a vaccination campaign. Susceptible (blue), Infected
(orange), Vaccinated (red) and Recovered (green).
Parameters: [S0 = 0.98, I0 = 0.02, rv = 0.2, rI = 0.4, D = 0.1, β̂ = 0.5, γ̂ = 0.1, umax = 0.1]

We observe in fig 1 that the vaccination campaign appears between approximately 5 and 10 weeks. The vaccination
quickly lowers the peak of infected people, and the total quantity of infected people is much lower than in the case
without vaccination (this quantity corresponds to the proportion of recovered people at the end of the epidemic).
To be convinced, we can compare with the classical SIR model without vaccination (and otherwise the same
parameters) shown in fig 2.

Figure 2 – Typical evolution of the SIR model without vaccination. Susceptible (blue), Infected (orange),
and Recovered (green). Parameters: [S0 = 0.98, I0 = 0.02, β̂ = 0.5, γ̂ = 0.1]

We can see that the proportion of people "saved" by vaccination is almost one half of the total population,
which is more than the total number of vaccinated people. This illustrates that vaccination protects
the population globally, as it allows the epidemic to be terminated quickly. Naturally, before vaccination, the two
figures are the same.

Further insights can be obtained by considering the value function V(t) and the vaccination cost g(t) in
fig 3, focusing in particular on the vaccination campaign. The global behavior that was expected in the
discussion after (3.11) is recovered. The vaccination campaign occurs between $t_1^*$ and $t_2^*$. It
begins a little before the first extremum of g, that is, when I(t) increases quickly and is just below $I_{thr}$
(which is the value of I where $\dot g(t) = 0$). People see the fast evolution of I(t) and consider that it is time to
be vaccinated (because the risk of infection is high). At $t_2^*$, we are just after the inflection point of g, which
corresponds to $I = I_{max}$ (see fig 1). Therefore, people understand that the epidemic is now decreasing, the risk
of being infected is lower, and they can stop the vaccination.

Figure 3 – Evolution of the vaccination cost g (blue) and the value function V (orange) around the vaccination
campaign, with the same parameters as fig 1

On the other hand, if the epidemic is not virulent enough, there is no vaccination campaign. This is
illustrated in fig 4, in which β̂ has been changed from 0.5 to 0.2 (we use a longer time scale because the
epidemic dynamics is slower). Now, since the epidemic does not take off ($I_{max} < I_{thr} = 0.5$), the best strategy
is to take λ(t) = 0 ∀t. We do not have any vaccination here: V(t) = 0. Furthermore, we can check that
g(t) ≥ 0 ∀t, that is, the vaccination represents a pure cost for individuals at each time.

Figure 4 – Left: Nash equilibrium of the SIR model without a vaccination campaign (same color code as
figure 1). Right: evolution of g (blue) and the value function V (orange).
Parameters: [S0 = 0.98, I0 = 0.02, rv = 0.2, rI = 0.4, D = 0.1, β̂ = 0.2, γ̂ = 0.1, umax = 0.1]

4 Genetic algorithms
4.1 Theoretical framework
We introduce a particular type of algorithm, genetic algorithms (GA), summarizing here the
chapter of [3] on this subject. GA are a way to solve an optimization problem, that is, to find $\vec{x}_{opt}$ such that
$Q(\vec{x}_{opt}) = \max_{\vec{x}} Q(\vec{x})$ for a certain function Q which is called the quality or "fitness" function. This can
be very useful for Mean Field Games where the Bellman equation cannot be derived, that is, Mean Field
Games where the "Bellman" step of the general scheme presented in (2.13) cannot be realized. There,
we have to find the best individual response to a global strategy numerically without the Bellman formalism.
We solve the problem directly from the cost function (Q is thus the cost function up to a minus
sign). The general idea is to test some individual strategies, compare their (individual) costs
and select the best ones in order to converge to the best individual strategy, imitating natural selection.

Genetic algorithms (and more generally evolutionary computing) are based on the natural selection principle.
They follow the following scheme:
— We define our quality function Q(~x) that we want to maximize.
— We define our search space, that is, the space of parameters that we want to optimize. It could be
directly the space where ~x lives, but it could also be a representation of ~x.
— We start with a certain (random) distribution of individuals (that are "candidates" ~x to maximize our
quality function Q).
— We make a "natural" selection of individuals: those with the best quality function are more adapted to
their environment and have a better chance to survive. Individuals who survive are called "parents".
— We create several offspring from the parents. We mix the characteristics of two (or more) random
parents to create a new individual. Furthermore, to get diversity and explore our space, we add
mutation: we add a small random value to one or more parameters of each offspring.
— Then we have a new generation of individuals and we can iterate the procedure in order to find $\vec{x}_{opt}$.
This type of algorithm is general. We have to choose the number of selected individuals (i.e. the number
of parents at each generation), the number of offspring and the number of initial candidates. We also have to
define the offspring procedure (the way we mix the characteristics of the parents) and the mutation operator
(which characteristics we change, and how many). All these quantities have an impact on the efficiency
of the algorithm; we have to adapt it to each problem. As an example, if the search space is very large, we
should take a large number of offspring compared to the number of parents in order to explore the space (we generally
work with a fixed number of individuals at each generation for convenience, but it is not necessary).

The algorithm terminates after a fixed number of iterations (generations), or when we are close enough to
the optimum of Q (if we know it). We can also imagine other criteria to end the algorithm. A minimal
sketch of this generic loop is given below.
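The sketch below shows one possible form of this generic loop, assuming real-valued candidates; all numerical values (population size, number of parents, mutation scale) are illustrative defaults of ours, not the settings used later in the report.

```python
import numpy as np

def genetic_maximize(Q, dim, n_pop=200, n_parents=20, n_gen=50,
                     mutation_scale=0.1, rng=np.random.default_rng(0)):
    """Generic genetic algorithm sketch: maximize a quality function Q over
    real vectors of length dim. Selection keeps the best n_parents candidates;
    each offspring averages two random parents and adds a Gaussian mutation."""
    pop = rng.uniform(0.0, 1.0, size=(n_pop, dim))       # random initial candidates
    for _ in range(n_gen):
        scores = np.array([Q(x) for x in pop])
        parents = pop[np.argsort(scores)[-n_parents:]]    # "natural" selection
        children = []
        while len(children) < n_pop - n_parents:
            a, b = parents[rng.integers(n_parents, size=2)]
            children.append(0.5 * (a + b) + mutation_scale * rng.normal(size=dim))
        pop = np.vstack([parents, children])
    scores = np.array([Q(x) for x in pop])
    return pop[np.argmax(scores)]                          # best candidate found
```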

4.2 Application to the SIR model with vaccination

As a first application, we propose to recover the results of [2] with a GA. The authors of [2] derive and solve the
Bellman equation for V in order to find the best individual vaccination strategy λ*. This is possible because
the cost function J (see (3.5)) is explicit enough. But we can imagine situations where the cost function is very
complex and it is not possible to derive a Bellman equation. Thus, we want to realize $f_I : u \longrightarrow \lambda^*$ with a
genetic algorithm instead of using the Bellman paradigm.

Thus, we start from J and u (see section 3.1) and we want to recover λ* (which is of the form $\lambda^* = u_{max}/S(t)$ on
$[t_1^*, t_2^*]$ and 0 otherwise).

We describe the general scheme applied to our problem:

— Our quality function is Q(λ) = −J(λ, u). We want to maximize it for a certain global strategy u.
— Our search space is $\mathcal{S} = \{0 \leq \lambda(t) \leq u_{max}/S(t),\ 0 \leq t \leq T\}$, which corresponds to the space of all possible
individual strategies. After discretization of time, we obtain a huge space with $n_p$ finite dimensions ($n_p$
is the number of discretization points).
— Our initial candidates are taken randomly. We take 200 candidates per generation.
— As a first guess, we simply select the candidates with the best quality function. We choose 10 to 50
parents (and so 190 to 150 offspring).
— Offspring procedure: we choose two parents A and B randomly and, for each t,
$\lambda_{off}(t) = \frac{\lambda_A(t) + \lambda_B(t)}{2} + \frac{u_{max}}{S(t)} \cdot \mathrm{random}(-1, 1)$. This defines the offspring procedure and the
mutation operator.
— We end the process after a certain number of iterations (i.e. a certain number of generations).
The problem is that $\mathcal{S}$ is huge because we have $n_p$ dimensions (≃ 4000). Thus, when we take initial
candidates λ(t) randomly, the quality function Q(λ) is very bad (worse than for λ(t) = 0 ∀t) and grows very
slowly at each generation. Actually, the algorithm explores a huge part of $\mathcal{S}$ which is flat, and we are far from
the peak of Q, because λ(t) has to be ordered in time to reach it.

To solve this issue, we propose to first take λ(t) = C = constant on [0, T] and make an optimization on
this C. When the best C is found, we divide the interval into two smaller intervals [0, T/2] and [T/2, T] and we
take λ(t) constant on each one. For the first offspring generation with "two values", we make mutations on
each interval to get new candidates with different values on [0, T/2] and on [T/2, T]. This method works because
the search space is considerably reduced (only one dimension for the first step, then two, etc.). When the
algorithm converges with two intervals, we cut each interval into two equal parts and we repeat the procedure.
We define a threshold in order to know whether we have reached the maximum of Q with a certain number of intervals
(that is, we increase the number of intervals when the quality function does not evolve during a few generations).
A sketch of this coarse-to-fine representation is given below.
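A minimal sketch of this coarse-to-fine representation (the helper names and the expansion onto the time grid are our own illustration):

```python
import numpy as np

def refine(values):
    """A strategy is stored as one constant value per interval; refinement
    splits every interval in two, duplicating its value, so the previous
    optimum is kept as the starting point of the next stage."""
    return np.repeat(values, 2)

def to_time_grid(values, n_points):
    """Expand the piecewise-constant representation onto the full time grid of
    n_points, so that the cost J (and hence Q) can be evaluated."""
    idx = (np.arange(n_points) * len(values)) // n_points
    return values[idx]

# usage sketch: start with one constant, optimize, then refine and repeat
lam = np.array([0.05])          # one interval over [0, T]
lam = refine(lam)               # two intervals: [0, T/2], [T/2, T]
```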

We test our algorithm with the same parameters as above (β̂ = 0.5) and, since we do not know anything
about the best vaccination campaign, we start with a global vaccination rate u(t) = 0 ∀t. The numerical and
analytical results of section 3 show that $\lambda^* = u_{max}/S(t)$ for $t \in [t_1^*, t_2^*]$ and 0 otherwise, with $t_1^* = 4.11$
weeks and $t_2^* = 21.89$ weeks (we take the same parameters as those of fig 1). With the genetic algorithm, we
found the following results.

Figure 5 – Results of the genetic algorithm after 50 generations (16 intervals). The limit in orange is
$\lambda_{max}(t) = u_{max}/S(t)$. In green are the analytical expected results and in blue the best candidate found by the GA

We see in fig 5 that the global behavior is recovered by the algorithm: the vaccination occurs between
the right bounds (the first bound is not perfectly reached here) and with the right intensity (we do not reach the
limit at each time because of the interval structure). An issue here was that there is a "plateau" in $\mathcal{S}$, because
g(t) → 0 quickly after the epidemic peak, so the algorithm found that there is almost no difference between
the above results and the same results with a huge vaccination campaign at the end of the epidemic (this does not
change Q).

In order to avoid this, we add a new offspring process to our algorithm: a "cloning" process. That is, we
clone many times the best candidate of a given generation. Then we associate with each interval some clones
(5 in our algorithm) and, on this interval only, we change the value of λ randomly (local mutation) for each
clone. We end up with many new offspring. We make a first selection among these clones to choose the best
ones. The cloning process allows us to explore this flat part of $\mathcal{S}$, and then we just have new candidates for the
next generation in competition with the "natural" offspring. In order to accelerate the process, we also add
a manual clone, which is the one that maximizes Q when we change only the value of λ on one interval (this
allows the vaccination campaign at the end of the epidemic to be cut quickly). These clones are also in competition
with the other clones to be the offspring of a generation. In the end, we take 20 parents, 60 natural offspring
and 120 clones (selected among all clones), that is, 200 candidates per generation.

We can go further and, instead of a few minutes (50 generations), wait longer, until 200 generations.
This is illustrated in fig 6 and fig 7.

Figure 6 – Results of the genetic algorithm after 200 generations (512 intervals). The limit in orange is
$\lambda_{max}(t) = u_{max}/S(t)$. In green is the analytical expected result and in blue the best candidate found by the GA

Figure 7 – Focus on the vaccination campaign, with the same color code as fig 6

When we focus on the vaccination campaign in fig 7, we can see that we recover the expected results perfectly.
That is, our algorithm is able to compute $f_I(u) = \lambda^*$ and thus to perform the "Bellman arrow" of (2.13).
Then, we can use the inductive sequence (2.13) to find the Nash equilibrium of our problem without the
Bellman equation. This result confirms that, in principle, we will be able to use genetic
algorithms to simulate our models if the Bellman paradigm is not usable.

5 SIR model with a social structure
5.1 Theoretical framework
Now we want to develop the Mean Field Game approach for models which present a mesoscopic description
of society. A first model that we will consider is a SIR model with a structure of social contacts defined in
[4]. We start with a general description of this model before including Mean Field Games. The idea is to classify
each individual by his age class i and to define some settings where there are contacts between individuals:
the schools, the community, the workplaces and the households. Let us introduce $M^k_{ij}$, the average frequency of
contact between an individual of age class i and someone of age class j in the setting k. All individuals of
age class i are equivalent, in the sense that they have the same frequency of contacts with other individuals
(we take the average over individuals at this level). The effective number of contacts with individuals of age
class j is obtained when we multiply $M^k_{ij}$ by the contact rate $N^k_i(t)$ of individuals of age class i in the setting
k. That is, $N^k_i(t)$ is only a number of contacts per unit of time, and we say nothing here about the nature of the
contacts (the details about who our contacts are is contained in $M^k_{ij}$). For example, a child at school will have a lot of
contacts per day with other children but almost no contacts with adults (the teachers here) or old people.

In the paper [4], the authors infer the matrix $M^k_{ij}$ from the demographic structure. That is, they use the
composition in number and age of individuals with (at least one) contact in each school, workplace, household.
Then they define a frequency of contacts for each individual of age class i with individuals of age class j in
a certain household or in a certain school (if this individual has contacts with this school or this household).
They construct the matrix $M^k_{ij}$ by taking the average over individuals of age class i, taking into account the
fact that some individuals have no contacts with some settings (for example, adults do not go to school,
unless they are teachers). This construction can be very useful to define and construct a "virtual
society" properly.

Now we can write down the SIR equations, indexing by i the susceptible/infected/recovered people of
age class i. We denote by q the probability of transmission (of the virus) per effective contact (between a
susceptible and an infected). We do not have any vaccination in this model. The SIR equations are (with n
age classes):

$$\dot S_i = -\left( \sum_{j=1}^n \sum_k q \cdot N_i^k(t) \cdot M_{ij}^k \cdot I_j(t) \right) \cdot S_i(t)$$
$$\dot I_i = \left( \sum_{j=1}^n \sum_k q \cdot N_i^k(t) \cdot M_{ij}^k \cdot I_j(t) \right) \cdot S_i(t) - \hat\gamma I_i(t) \qquad (5.1)$$
$$\dot R_i = \hat\gamma \cdot I_i(t)$$

Indeed, the probability for someone of age class i to be infected between t and t + dt (if this person is still
susceptible at t) is

$$\left( \sum_{j=1}^n \sum_k q \cdot N_i^k(t) \cdot M_{ij}^k \cdot I_j(t) \right) \cdot dt \equiv \lambda_i(t) \cdot dt \,: \qquad (5.2)$$

for each setting k and each age class j, our individual of age class i has an average number of effective contacts
$N_i^k(t) \times M_{ij}^k$ during dt with someone of age class j in the setting k. For each contact, there is a probability
$I_j$ that the contact person is infected and, if this person is infected, there is a probability q that the
disease is transmitted during the contact. Finally, we naturally sum over all age classes of the contacts and
all settings. Therefore, we obtain (5.2) and the SIR equations (5.1) follow. $\lambda_i$ is called the "force" of infection
for an individual of age class i.
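The force of infection (5.2) is straightforward to evaluate; a minimal sketch follows, where the array layout (settings × age classes) is our own convention for illustration.

```python
import numpy as np

def force_of_infection(N, M, I, q):
    """Force of infection lambda_i of eq. (5.2).
    N: contact rates, shape (n_settings, n_ages), N[k, i] = N_i^k(t).
    M: contact matrices, shape (n_settings, n_ages, n_ages), M[k, i, j] = M_ij^k.
    I: infected proportions per age class, shape (n_ages,).
    q: transmission probability per effective contact."""
    n_settings, n_ages = N.shape
    lam = np.zeros(n_ages)
    for k in range(n_settings):
        for i in range(n_ages):
            lam[i] += q * N[k, i] * np.dot(M[k, i, :], I)   # sum over age classes j
    return lam
```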

Notice that here we take, for simplicity, a recovery rate γ̂ independent of i, and we consider only 3
compartments (S, I, R). We could include more but, beyond an increase in the number of equations, this
would not change the structure of the problem, and the following analysis is extendable to these more refined
models. Note also that we assume the average frequency of contacts $M^k_{ij}$ constant, which may
not be true in reality.

5.2 Mean Field Game approach
We propose to derive a Mean Field Game version of this model. We consider n types of agents. An agent
of type i is an individual of age class i in our society. The control parameters of the agents (of type i) are
the contact rates $N^k_i(t)$ of individuals of age class i in the setting k. Thus, we have a Mean Field Game with
3n states ($S_1, S_2, ..., S_n, I_1, ..., R_1, ...$), but each agent of type i can only be in $S_i$, $I_i$ or $R_i$.

The idea is that if the epidemic is virulent, agents will reduce their contact rates $N^k_i(t)$ during this period
to avoid infection, but this reduction has a certain cost $f(\{N^k_i\})$. We assume this cost to be decreasing, with a
positive second derivative. Indeed, this means that the larger the effort made to reduce $N^k_i$, the faster the
price to be paid increases. For the cost of infection due to the epidemic, we take a simple form $r_I(i)$, paid
when an individual of age class i is infected. This cost increases with i, modeling that we suffer more from
infection when we are older.
Introducing $\phi^i_I(t)$, the probability for an individual of age class i to be infected before t, an infection for
this individual happens between t and t + dt with probability $(1 - \phi^i_I(t)) \cdot \lambda_i(t) \cdot dt$. From an egoistic point
of view, the total cost associated with the reduction of contacts is zero once we are infected, because the
risk of a new infection is then zero as well. Thus, the total cost function for an individual of age class i (i is fixed
in the following), starting the optimization at t, for a certain individual strategy $\{N^k_i\}$ (and a certain global
strategy, which is implicit in the notation), is

$$U^i(\{N^k_i\}, t) \equiv \int_t^T \left[ \lambda_i(s) \cdot r_I(i) + f(\{N^k_i(s)\}) \right] \cdot (1 - \phi^i_I(s)) \cdot ds \,. \qquad (5.3)$$

Therefore the value function is:

$$U^i(t) = \min_{\{N^k_i\}} U^i(\{N^k_i\}, t) \,. \qquad (5.4)$$

Now we are able to derive the Hamilton-Jacobi-Bellman equation followed by $U^i$. We denote by $St^i_t$ the state
of a reference agent of age class i at t. The Markov process of this Mean Field Game is:

$$P(St^i_{t+dt} = I_i \,|\, St^i_t = S_i) = \lambda_i(t)\,dt$$
$$P(St^i_{t+dt} = S_i \,|\, St^i_t = S_i) = 1 - \lambda_i(t)\,dt \qquad (5.5)$$
$$P(St^i_{t+dt} = R_i \,|\, St^i_t = I_i) = \hat\gamma\,dt$$

We use a Bellman argument to find the evolution of $U^i$ backwards in time (we consider, as in the previous cases,
that the state at t is $S_i$, otherwise $U^i(t) = 0$):

$$U^i(t)\big|_{St^i_t = S_i} = \min_{\{N^k_i(t)\}} \mathbb{E}_{St^i_t = S_i} \left[ U^i(t + dt)\big|_{St^i_{t+dt}} + C^i(t)\big|_{St^i_{t+dt}} \right] . \qquad (5.6)$$

From the Markov process (5.5) and the definition of $U^i$ we have:

$U^i(t + dt) = 0$ if $St^i_{t+dt} = I_i$, and it is unchanged if $St^i_{t+dt} = S_i$;

$C^i(t) = f(\{N^k_i(t)\}) \cdot dt + r_I(i)$ if $St^i_{t+dt} = I_i$, and $C^i(t) = f(\{N^k_i(t)\}) \cdot dt$ if $St^i_{t+dt} = S_i$.

Writing the expectation in (5.6) explicitly,

$$U^i(t) = \min_{\{N^k_i(t)\}} \left[ \lambda_i(t)\,dt \left( r_I(i) + f(\{N^k_i(t)\})\,dt \right) + (1 - \lambda_i(t)\,dt) \left( U^i(t + dt) + f(\{N^k_i(t)\})\,dt \right) \right] ; \qquad (5.7)$$

this gives the Hamilton-Jacobi-Bellman equation of our Mean Field Game:

$$-\frac{dU^i(t)}{dt} + \min_{\{N^k_i(t)\}} \left[ \lambda_i(t) \left( r_I(i) - U^i(t) \right) + f(\{N^k_i(t)\}) \right] = 0 \,, \qquad (5.8)$$

where $\lambda_i(t)$ depends on $N^k_i(t)$ (see its definition (5.2)). For an epidemic evolution given by the SIR system
(5.1), we can compute the behavior of our reference agent of age class i. This agent will take an optimal
strategy $N^{k*}_i$ at each time t:

$$\{N^{k*}_i(t)\} = \operatorname*{argmin}_{\{N^k_i(t)\}} \left[ \lambda_i(t) \left( r_I(i) - U^i(t) \right) + f(\{N^k_i(t)\}) \right] . \qquad (5.9)$$
5.3 Practical implementation
We propose a form for f inspired from [5], where the cost of the contact rate (β̂ in that paper) is of the
form

$$c(\hat\beta) = \frac{\hat\beta_0}{\hat\beta} - 1 \qquad (\hat\beta \in [\hat\beta_{min}, \hat\beta_0]) \,, \qquad (5.10)$$

expressing the discrepancy between the chosen strategy β̂ and the strategy without effort, $\hat\beta_0$. There is no
cost associated with the "effortless" strategy $\hat\beta_0$, and the farther β̂ is from $\hat\beta_0$, the higher the price to be paid.

In the same spirit, we propose here a cost of the form:

$$c(k, N^k_i) = \left( \frac{N^k_{i0}}{N^k_i} \right)^{\mu_k} - 1 \,, \qquad (5.11)$$

which has the same qualitative properties as (5.10), but here $\mu_k$ models the "attachment" to the setting k.
It is, for example, easier to reduce one's contacts at work than inside one's family. Notice that this cost
function is clearly decreasing, with a positive second derivative. We furthermore impose bounds on $N^k_i$: a
minimum contact rate $N^k_{i,min}$ (denoted simply $N^k_{im}$ in the following) and a maximum $N^k_{i0}$, which is the contact rate
without effort. Then, the total cost of contact reduction is:

$$f(\{N^k_i(t)\}) = \sum_k c(k, N^k_i(t)) \,. \qquad (5.12)$$

Notice that we take a form of the cost independent of the age class i (for simplicity, and because we assume
that contact reduction is painful regardless of age).

If we use our expression (5.12) of f, we can derive the expression of the argmin in (5.9), noticing that

$$\lambda_i(t) \left( r_I(i) - U^i(t) \right) + f(\{N^k_i(t)\}) = \sum_k \left[ q \sum_j M^k_{ij} I_j(t)\, N^k_i(t) \left( r_I(i) - U^i(t) \right) + \left( \frac{N^k_{i0}}{N^k_i} \right)^{\mu_k} - 1 \right] . \qquad (5.13)$$

Then, since each $N^k_i$ appears in only one term, we can minimize each term of the sum independently. We
can see that this formula diverges asymptotically, when $N^k_i$ tends to 0 (f diverges) or when $N^k_i$ diverges ($\lambda_i$
diverges). Furthermore, we can check by elementary computations that each term decreases until a unique
minimum and increases afterwards.

The minimum is thus the minimum of a standard function of one variable (see (5.13)):

$$\operatorname*{argmin}_{\{N^k_i\} \in \prod_k [N^k_{im},\, N^k_{i0}]} \left[ \, \dots \, \right] = \left( \operatorname*{argmin}_{N^k_i \in [N^k_{im},\, N^k_{i0}]} \left[ \, \dots \, \right] \right)_k \,, \qquad (5.14)$$

and, by setting the derivative equal to zero, we find that

$$N^{k*}_i = \left( \frac{\mu_k \, (N^k_{i0})^{\mu_k}}{q \sum_j M^k_{ij} I_j \cdot (r_I(i) - U^i(t))} \right)^{\frac{1}{\mu_k + 1}} . \qquad (5.15)$$

This formula corresponds to the true optimal $N^{k*}_i$ if it lies inside the bounds $[N^k_{im}, N^k_{i0}]$. If it is
above $N^k_{i0}$, the optimal value is $N^{k*}_i = N^k_{i0}$, and if it is below $N^k_{im}$, the optimal value is $N^{k*}_i = N^k_{im}$.

We can understand the behavior of the value function $U^i$ from (5.8): as t → ∞, $U^i(\infty) = 0$, because
the total cost is zero if we begin the optimization at the end of the epidemic. When we go backwards in time,
$r_I(i) - U^i(t) \geq 0$, and thus the minimum in (5.8) is positive, implying:

$$-\frac{dU^i}{dt} \geq 0 \,. \qquad (5.16)$$

If at a time t, $U^i(t)$ reaches $r_I(i)$, then $dU^i/dt = 0$ (because the optimal $N^k_i$ is then $N^k_{i0}$ and $U^i(t)$ becomes
constant). Thus, (5.16) is true for all t and therefore $U^i$ decreases. Remark: by analyzing (5.8) directly,
$U^i(t)$ tends to $r_I(i)$ from below and, in this limit, $N^{k*}_i = N^k_{i0}$.
5.4 Numerical simulations
Our numerical simulations are made under the following assumptions.
— We take only 3 age classes (youth, adults, retired), corresponding respectively to the ith column or row
in our contact matrices or directly to i = 1, 2, 3.
— rI (i) = rI × 10i (to simulate the impact of infection when we are older). We use rI = 1 (unless other
indication) to get a competition between the infection cost and the cost due to contact reduction.
— We take 4 settings as in [4] : schools, workplaces, community and households.
— Our matrices Mijk
are inspired from the data analysis of [4] to obtain a qualitative comprehension of
agent’s behavior. But we do not made any fit to get these matrices.
   
0.95 0.05 0 0 0 0
M S = 0.005 0.005 0 ; M W = 0 1 0
0 0 0 0 0 0
   
0.25 0.5 0.25 0.3 0.5 0.2
MC = 0.25 0.5 0.25 ; M H = 0.5 0.3 0.2
0.25 0.5 0.25 0.3 0.2 0.5

— We choose Ni0k
in the same way from [4]. We write it as a matrix for concision (first row for schools,
second for workplaces, third for community, fourth for households) :
 
7.5 2.5 0
0
0 5 0
N(ki) = 
2.5 2.5 2.5
2.5 2.5 2.5

Nk
We set Nimk
= Ai0
k
. That is the minimum contact rate is independent of i. We take AS = 3 (schools),
AC = AW = 5 (community and workplace), AH = 2 (Household) to induce the fact that we cannot
reduce as much our contacts in our household as inside schools or at work.
— To induce the fact that the reduction of contacts is harder inside households and easier in community,
we take : µS = µW = 2, µH = 3, µC = 1
— We take 25% of “young” and “retired” in our society and 50% of adults.
— We take a probability of transmission per real effective contact $q = 0.2$ and a recovery rate $\hat\gamma = 0.1$. We then work with fixed initial conditions : $(S_i(0), I_i(0)) = ([0.98, 0.98, 0.98], [0.02, 0.02, 0.02])$.

Since all these quantities are typical, the observed behaviors are a priori general. This was actually checked by running many simulations with different parameters.
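For concreteness, these parameters can be collected as follows (a setup sketch; the array layout and variable names are ours, the numerical values are those listed above):

```python
import numpy as np

# Contact matrices per setting (rows/columns ordered as young, adults, retired).
M = {
    "S": np.array([[0.95, 0.05, 0.0], [0.005, 0.005, 0.0], [0.0, 0.0, 0.0]]),  # schools
    "W": np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]),        # workplaces
    "C": np.array([[0.25, 0.5, 0.25]] * 3),                                    # community
    "H": np.array([[0.3, 0.5, 0.2], [0.5, 0.3, 0.2], [0.3, 0.2, 0.5]]),        # households
}

# Reference contact rates N_i0^k (rows: S, W, C, H ; columns: young, adults, retired).
N0 = np.array([[7.5, 2.5, 0.0],
               [0.0, 5.0, 0.0],
               [2.5, 2.5, 2.5],
               [2.5, 2.5, 2.5]])

A = np.array([3.0, 5.0, 5.0, 2.0])            # maximal reduction factors A_S, A_W, A_C, A_H
Nm = N0 / A[:, None]                          # minimal contact rates N_im^k = N_i0^k / A_k
mu = np.array([2.0, 2.0, 1.0, 3.0])           # cost exponents mu_S, mu_W, mu_C, mu_H
rI = 1.0 * 10.0 ** np.arange(1, 4)            # r_I(i) = r_I * 10^i for i = 1, 2, 3 with r_I = 1
pop = np.array([0.25, 0.5, 0.25])             # population fractions: young, adults, retired
q, gamma_hat = 0.2, 0.1                       # transmission probability and recovery rate
S0 = np.array([0.98, 0.98, 0.98])             # initial susceptible fractions
I0 = np.array([0.02, 0.02, 0.02])             # initial infected fractions
```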

As in the previous MFG, we use the inductive sequence procedure of section 2.4, with the following scheme :

$$
\{\bar N_i^k\}_n \;\xrightarrow{\text{Kolmogorov}}\; I(\{\bar N_i^k\}_n) \;\xrightarrow{\text{Bellman}}\; f(\{\bar N_i^k\}_n) = \{N_i^{k*}\}_n \;\xrightarrow{\text{Symmetric}}\; \{\bar N_i^k\}_{n+1} = \{N_i^{k*}\}_n .
\tag{5.17}
$$

Notice that here we were able to find the optimal individual response $N_i^{k*}$ through an analytical minimization of the Bellman equation (5.8), namely (5.15), where the dependence on the global strategy $\{\bar N_i^k\}$ is implicit through $I_j(t)$. We follow the above scheme until $N_i^{k*}(t) = \bar N_i^k(t)$ for all $t, k, i$, obtaining in this way the Nash equilibrium. This Nash equilibrium is actually reached quite quickly (a few iterations) with this model, because the cost function is sufficiently flat.
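Schematically, the iteration (5.17) can be organized as in the following sketch; `solve_kolmogorov` (forward SIR dynamics under the global contacts) and `best_response` (backward Bellman optimization, e.g. built on eq. (5.15)) are placeholders for the solvers described in the text, not library functions:

```python
def nash_fixed_point(N_bar_init, max_iter=50, tol=1e-6):
    """Inductive-sequence scheme (5.17): Kolmogorov -> Bellman -> symmetry,
    iterated until the best response coincides with the global strategy."""
    N_bar = N_bar_init
    for n in range(max_iter):
        I_traj = solve_kolmogorov(N_bar)            # epidemic trajectory under the global strategy
        N_star = best_response(I_traj)              # individual optimum {N_i^{k*}}_n
        if np.max(np.abs(N_star - N_bar)) < tol:    # Nash equilibrium: N^* = bar N for all t, k, i
            return N_star
        N_bar = N_star                              # symmetry step: next global strategy
    return N_bar
```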

Figure 8 – Evolution of epidemic by age group. From left to right : young, adult and retired epidemic.
Recovered are in green, Infected in orange, Susceptible in blue

We can see in fig 8 that the epidemic peaks occur around 3 to 4 weeks, but the precise time of the peak differs between classes because of the interactions and the different behaviors. We observe that the number of infected retired people is very low compared to that of young people, which is due to the fact that being infected is more dangerous for retired people than for adults and young people. We also observe a small epidemic peak for the retired, despite their efforts from the beginning to reduce their contacts, which is due to the interactions with the other classes.

Figure 9 – Evolution of the global epidemic. Comparison between the Nash equilibrium strategy and the strategy without efforts (dotted lines). Recovered are in green, Infected in orange and Susceptible in blue

Fig 9 illustrates the effect of contact reduction compared with the scenario where no change in the contact rates is made. The epidemic peak comes earlier and is lower with the Nash equilibrium strategy. The extent of the contact reduction is strongly affected by $r_I$, which determines how much effort the individuals are ready to make to reduce their contacts (with a small $r_I$ we observe little reduction, and with a very high $r_I$ the reduction is maximal). We can check this by plotting the contact rates.

Figure 10 – Evolution of contact rates. Retired people are in blue, adults in orange and young in green

Fig 10 shows the different behaviors of the agents: retired people do indeed reduce their contacts, because they face a higher risk. This reduction is very significant in the community setting (they reach $N_{min}$ for a certain period of time) because it is easy for them to reduce their contacts there, whereas in households the period of reduction is shorter. The situation for adults is intermediate: they reduce their contacts in the community and in workplaces during the epidemic peak, but they do not change anything in households because the peak is not high enough. For young people, the risk with such an epidemic is low and therefore they do not change their behavior. Notice that in schools, adults do not change their contacts because the probability for an adult to be in contact with young people at school is very low.

Figure 11 – Evolution of contact rates. Retired are in blue, adults in orange and young in green. We take
here rI = 5 (instead of rI = 1 above)

As expected, we see in fig 11 that when the cost due to infection is higher, the contact reduction is much more drastic. Young people start to reduce their contacts in schools. Furthermore, to avoid an epidemic rebound, contacts grow back only very slowly after the epidemic peak.

5.5 Another method to reach the Nash equilibrium
The inductive sequence method used so far works well because the underlying map is contracting (see section 2.4). But sometimes this is not the case and the method does not converge. This happens in particular in our model if we want to implement global constraints, such as the ones that would be implied by a lockdown (for example, $N_i^k \in [N_{im}^k, N_{i0}^k]$ if $I < I_{sat}$ and $N_i^k = N_{im}^k$ if $I > I_{sat}$).

The impact on the cost function is that a small change of the global strategy $\bar N_i^k$ changes $I$, and therefore the bounds of the “lockdown”, implying that the associated cost paid by the individuals changes a lot. The cost function is then no longer contracting, and this is why the inductive sequence does not converge.

Instead, we can use the second method presented in section 2.4: the gradient descent. This method finds a priori a local minimum (with respect to the individual strategy) among the Nash candidates. But we can test whether we reach the same equilibrium as with the first method (when the latter converges).

In order to implement it numerically, we have to compute the Gateaux derivative of the cost (see [31] for a rigorous definition and [5] for an application similar to what we do here). This cost is (5.3) at $t = 0$. Note that for clarity, and to avoid heavy notations, we denote in the following $\{\bar N_i^k\}$ by $\bar N$, $\{N_i^k\}$ by $N$, $N(t)$ by $N_t$ and $\Phi_I^i(t)$ by $\phi_i^{N,\bar N}(t)$:

$$
C^i(N, \bar N) = \int_0^T \bigl[\lambda_i^{N,\bar N}(t)\, r_I(i) + f(N_t)\bigr]\,\bigl(1 - \phi_i^{N,\bar N}(t)\bigr)\, dt .
\tag{5.18}
$$

With the Markov process (5.5), we know that

$$
\phi_i^{N,\bar N}(t) = 1 - e^{-\int_0^t \lambda_i^{N,\bar N}(s)\, ds} ,
\tag{5.19}
$$

and, noticing that $\lambda_i^{N,\bar N}(t)\bigl(1 - \phi_i^{N,\bar N}(t)\bigr) = \dot\phi_i^{N,\bar N}(t)$, the first term of $C^i$ integrates to give

$$
C^i(N, \bar N) = \int_0^T f(N_t)\bigl(1 - \phi_i^{N,\bar N}(t)\bigr)\, dt + r_I(i)\,\phi_i^{N,\bar N}(T) .
\tag{5.20}
$$
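On a regular time grid, (5.19) and (5.20) can be evaluated directly. The following sketch (function and argument names are ours) computes the infection probability and the total cost for one age class:

```python
def individual_cost(lam, N_traj, f, rI_i, dt):
    """Discretized evaluation of (5.19)-(5.20) for one age class i.

    lam    : lambda_i^{N,Nbar}(t) sampled on a regular time grid of step dt
    N_traj : contact strategy N_t on the same grid
    f      : contact-reduction cost function f(N)
    """
    Lambda = np.cumsum(lam) * dt                       # int_0^t lambda_i(s) ds (Riemann sum)
    phi = 1.0 - np.exp(-Lambda)                        # eq. (5.19): probability of having been infected
    running = np.sum(f(N_traj) * (1.0 - phi)) * dt     # int_0^T f(N_t)(1 - phi(t)) dt
    return running + rI_i * phi[-1], phi               # eq. (5.20), phi[-1] approximating phi(T)
```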

Now we can compute the Gateaux derivative $D_h C^i$ of $C^i$ with respect to the first variable, in the direction $h$:

$$
D_h C^i(N, \bar N) \equiv \lim_{\epsilon \to 0} \frac{1}{\epsilon}\bigl(C^i(N + \epsilon h, \bar N) - C^i(N, \bar N)\bigr)
\tag{5.21}
$$
$$
= r_I(i)\, D_h \phi_i^{N,\bar N}(T) + \int_0^T \Bigl[ D_h f(N_t)\bigl(1 - \phi_i^{N,\bar N}(t)\bigr) - f(N_t)\, D_h \phi_i^{N,\bar N}(t) \Bigr] dt ,
\tag{5.22}
$$

and the two Gateaux derivatives appearing in (5.22) are

$$
D_h \phi_i^{N,\bar N}(t) \equiv \lim_{\epsilon \to 0} \frac{1}{\epsilon}\bigl(\phi_i^{N+\epsilon h,\bar N}(t) - \phi_i^{N,\bar N}(t)\bigr)
= \bigl(1 - \phi_i^{N,\bar N}(t)\bigr) \int_0^t h(s)\, \frac{d\lambda_i^{N,\bar N}(s)}{dN}\, ds
\tag{5.23}
$$
$$
D_h f(N_t) \equiv \lim_{\epsilon \to 0} \frac{1}{\epsilon}\bigl(f(N_t + \epsilon h_t) - f(N_t)\bigr) = f'(N_t)\, h_t .
\tag{5.24}
$$

We therefore obtain, by substitution into (5.22):

$$
D_h C^i(N, \bar N) = \bigl\langle h, \nabla_1 C^i(N, \bar N) \bigr\rangle_T ,
\tag{5.25}
$$

where

$$
\nabla_1 C^i(N, \bar N)(\cdot) = \frac{d\lambda_i^{N,\bar N}(\cdot)}{dN}\Bigl( r_I(i)\bigl(1 - \phi_i^{N,\bar N}(T)\bigr) - L_\cdot(N, \bar N) \Bigr) + \bigl(1 - \phi_i^{N,\bar N}(\cdot)\bigr)\, f'(N_\cdot) ,
\tag{5.26}
$$

with

$$
L_t(N, \bar N) \equiv \int_t^T f(N_s)\bigl(1 - \phi_i^{N,\bar N}(s)\bigr)\, ds .
\tag{5.27}
$$
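Discretized on the same time grid, (5.26)–(5.27) can be assembled as in the following sketch (for a single age class and setting; the trajectory arrays are assumed to be precomputed, and the names are ours):

```python
def grad_cost(phi, dlam_dN, f_N, fprime_N, rI_i, dt):
    """Gradient (5.26) of the individual cost, sampled on the time grid.

    phi      : phi_i^{N,Nbar}(t), e.g. from `individual_cost`
    dlam_dN  : d lambda_i^{N,Nbar}(t)/dN, here q * sum_j M_ij^k I_j(t)
    f_N      : f(N_t) along the trajectory
    fprime_N : f'(N_t) along the trajectory
    """
    integrand = f_N * (1.0 - phi)
    # L_t = int_t^T f(N_s)(1 - phi(s)) ds, eq. (5.27), as a reversed cumulative sum
    L = np.cumsum(integrand[::-1])[::-1] * dt
    return dlam_dN * (rI_i * (1.0 - phi[-1]) - L) + (1.0 - phi) * fprime_N
```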

Equation (5.26) is used for the numerical simulation of the gradient descent. Writing (2.14) explicitly for this cost:

$$
N^{n+1} = N^n - h \cdot \nabla_1 C(N^{n+1}, N^n) .
\tag{5.28}
$$

In practice we use $\nabla_1 C(N^n, N^n)$ numerically (a first-order approximation, which turned out to be sufficient). Here $h$ is the step of the gradient descent and $N$ stands, as previously, for $\{N_i^k\}$.
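A minimal sketch of the resulting descent loop is given below; `assemble_gradient` is a placeholder wrapping the gradient computation of the previous sketch over all classes, settings and time steps, and the projection onto the admissible bounds is our addition to keep the iterate inside $[N_{im}^k, N_{i0}^k]$:

```python
def gradient_descent(N_init, Nm, N0, step=0.05, max_iter=500, tol=1e-6):
    """Gradient descent (5.28), first order: the gradient is evaluated at (N^n, N^n)."""
    N = N_init
    for n in range(max_iter):
        g = assemble_gradient(N, N)                # nabla_1 C(N^n, N^n)
        N_next = np.clip(N - step * g, Nm, N0)     # descent step (5.28) + projection on the bounds
        if np.max(np.abs(N_next - N)) < tol:       # converged to a stationary strategy
            return N_next
        N = N_next
    return N_next
```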

Now we can check in fig 12 the convergence of the gradient descent method towards the Nash equilibrium found with the inductive sequence method:

Figure 12 – Convergence of the two methods with the same parameters as fig 10. Method indicated by -1 : inductive sequence. Method indicated by -2 : gradient descent

We clearly see in fig 12 the convergence of the gradient descent method towards the Nash equilibrium found by the inductive sequence method. This second method could be useful to treat the cases where the first method does not converge (it is slower, but more robust with respect to the form of the map). Even if we cannot guarantee that the minimum obtained in this way is a true minimum (i.e. a true Nash equilibrium) and not just a local one, this can be checked afterwards, by inserting the obtained result as an entry of the iterative scheme (5.17) and checking that it is indeed a fixed point. If this turns out not to be the case, we can think of a hybrid algorithm (mixing the two methods above) to converge effectively towards the true Nash equilibrium.

6 Conclusion
In this work, we develop a SIR model with a social structure where the contact rates between individuals are intrinsic instead of extrinsic, and we treat it within the Mean Field Game paradigm. We find the Nash equilibrium using two different approaches, the inductive sequence and the gradient descent, the latter allowing us to address models for which the inductive sequence method does not work. We also develop a genetic algorithm which performs the optimization and allows us to solve numerically a Mean Field Game when the Bellman equation cannot be derived.

We have developed some tools which can be used to further pursue this line of research. Possible extensions of our work are, in the short term:
• Add global constraints ([21], [22])
• Study other kinds of mesoscopic description
And on the longer term:
• Add spatio-temporal dynamics to our models ([26], [27], [28])
• Calibrate the cost functions (i.e. comparison with empirical data and discussion with other researchers)

We can finally think about other models, with other approaches to describe heterogeneous interactions in the society. An example is network-based models ([29], [30]): we could try to homogenize them at an appropriate level to model globally the spread of epidemics and apply Mean Field Games to them.

Finally, I would like to sincerely thank Denis Ullmo for his great support, and all the LPTMS team for their warm welcome to the laboratory.

References
[1] Diogo A. Gomes, Joana Mohr, Rafael Rigão Souza. Continuous time finite state Mean Field Games. Springer Science+Business Media, 2013
[2] Laetitia Laguzet, Gabriel Turinici, Ghozlane Yahiaoui. Equilibrium in an individual - societal SIR vac-
cination model in presence of discounting and finite vaccination capacity. Viorel Barbu, Cătălin Lefter,
Ioan I. Vrabie. New Trends in Differential Equations, Control Theory and Optimization, World Scientific
Publishing Co, pp.201 - 214, 2016
[3] Introduction to evolutionary computing (natural computing series, Springer). A.E.Eiben and J.E.Smith,
Second edition, 2015.
[4] Fumanelli L, Ajelli M, Manfredi P, Vespignani A, Merler S (2012) Inferring the Structure of Social Contacts
from Demographic Data in the Analysis of Infectious Diseases Spread. PLoS Comput Biol 8(9) : e1002673.
doi:10.1371/journal.pcbi.1002673
[5] R. Elie, E. Hubert, and G. Turinici. Contact rate epidemic control of COVID-19 : an equilibrium view.
Math. Model. Nat. Phenom., 15(35), 2020.
[6] Jean-Pierre Bourguignon, Calcul variationnel, Palaiseau, Éditions de l'École Polytechnique, 2008, 328 p. (ISBN 978-2-7302-1415-5), p. 7 and 27-28.
[7] G. Turinici, Metric gradient flows with state dependent functionals : The Nash-MFG equilibrium flows
and their numerical schemes. Nonlinear Anal. 165 (2017) 163-181.
[8] Bailey, Norman T. J. (1975). The mathematical theory of infectious diseases and its applications (2nd
ed.). London : Griffin. ISBN 0-85264-231-8.
[9] Herbert W. Hethcote, The Mathematics of Infectious Diseases. https://doi.org/10.1137/S0036144500371907
[10] Haijiao Li, Shangjiang Guo, dynamics of a SIRC epidemiological model, Electronic Journal of Differential
Equations, Vol. 2017 (2017), No. 121, pp. 1–18. ISSN : 1072-6691
[11] Maba Boniface Matadi, On the integrability of the SIRD epidemic model, Commun. Math. Biol. Neu-
rosci., 2020 (2020), Article ID 47
[12] Gao, Shujing ; Teng, Zhidong ; Nieto, Juan J. ; Torres, Angela (2007). “Analysis of an SIR Epidemic
Model with Pulse Vaccination and Distributed Time Delay”. Journal of Biomedicine and Biotechnology.
2007: 64870. doi:10.1155/2007/64870. PMC 2217597. PMID 18322563.
[13] https://modelisation-covid19.pasteur.fr
[14] Henrik Salje, Cécile Tran Kiem, Noémie Lefrancq, Noémie Courtejoie, Paolo Bosetti, Juliette Paireau,
Alessio Andronico, Nathanaël Hozé, Jehanne Richet, Claire-Lise Dubost, Yann Le Strat, Justin Lessler,
Daniel Levy-Bruhl, Arnaud Fontanet, Lulla Opatowski, Pierre-Yves Boelle, Simon Cauchemez Estimating
the burden of SARS-CoV-2 in France, Science (10 Jul 2020). https://doi.org/10.1126/science.abc3517
[15] J.-M. Lasry and P.-L. Lions, Jeux à champ moyen. I-Le cas stationnaire. C R Math. 343 (2006) 619-625.
[16] J.-M. Lasry and P.-L. Lions, Jeux à champ moyen. II-Horizon fini et contrôle optimal. C R Math. 343
(2006) 679-684.
[17] J.-M. Lasry and P.-L. Lions, Mean Field Games. Jpn. J. Math. 2 (2007) 229-260.
[18] M. Huang, R. P. Malhamé, and P. E. Caines. Large population stochastic dynamic games : closed-loop
McKean–Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst., 6(3) :221–252,
2006.
[19] R. Carmona, F. Delarue, et al. Probabilistic Theory of Mean Field Games with Applications I-II. Sprin-
ger, Berlin (2018).
[20] D. Ullmo, I. Swiecicki, T. Gobron, Quadratic Mean Field Games. https://www.sciencedirect.com/science/article/pii/S0370157319300018
[21] R. Morton and K.H. Wickwire, On the optimal control of a deterministic epidemic. Adv. Appl. Prob. 6
(1974) 622-635.
[22] K. Wickwire, Optimal isolation policies for deterministic and stochastic epidemics. Math. Biosci. 26
(1975) 325-346
[23] F.D. Sahneh, F.N. Chowdhury and C.M. Scoglio, On the existence of a threshold for preventive behavioral
responses to suppress epidemic spreading. Sci. Rep. 2 (2012) 632.
[24] A. Rizzo, M. Frasca, M. Porfri, Effect of individual behavior on epidemic spreading in activity-driven
networks. Phys. Rev. E 90 (2014) 042801.

[25] D.Mistry, Inferring high-resolution human mixing patterns for disease modeling. arXiv :2003.01214
[26] Merler S, Ajelli M, Pugliese A, Ferguson NM (2011) Determinants of the spatiotemporal dynamics of the
2009 H1N1 pandemic in Europe : Implications for real-time modelling. PLoS Comput Biol 7 : e1002205.
[27] Ciofi Degli Atti ML, Merler S, Rizzo C, Ajelli M, Massari M, et al. (2008) Mitigation Measures for
Pandemic Influenza in Italy : An Individual Based Model Considering Different Scenarios. PLoS ONE 3 :
e1790
[28] Viboud C, Bjornstad ON, Smith DL, Simonsen L, Miller MA, et al. (2006). Synchrony, waves, and spatial
hierarchies in the spread of influenza. Science 312 : 447–451.
[29] Eubank, S., Guclu, H., Anil Kumar, V. et al. Modelling disease outbreaks in realistic urban social
networks. Nature 429, 180–184 (2004). https://doi.org/10.1038/nature02541
[30] Meyers LA, Newman MEJ, Pourbohloul B (2006) Predicting epidemics on directed contact networks. J
Theor Biol 240 : 400–418
[31] Gateaux, R (1919), “Fonctions d’une infinité de variables indépendantes”, Bulletin de la Société Mathé-
matique de France, 47 : 70–96.

