SUTRA - ML Case Study
Manindra Agrawal
IIT Kanpur
Existing Models for Pandemics
Modelling of Pandemics
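• Pandemics such as plague, flu, and cholera exhibit a sharp rise and fall.
• The SIR model tracks the fractions of the population that are susceptible ($S$), infected ($I$), and removed ($R$):
$$S(t) + I(t) + R(t) = 1$$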
SIR Model: Spread of Infection
• Susceptible get infected in proportion to both $S$ and $I$:
• Suppose an infected person meets $k$ persons on average per day and transfers
infection to each with probability $p$.
• Of the persons met, a fraction $S$ is susceptible, so one infected person newly infects $kpS = \beta S$ persons in one day, where $\beta = kp$.
• Therefore, the fraction of the population newly infected in one day is $\beta S I$:
$$\frac{dS}{dt} = -\beta S I$$
SIR Model: Removal of Infected
• Infected get removed in proportion to $I$:
$$\frac{dR}{dt} = \gamma I$$
SIR Model: Change in Infected
• $I$ changes with new infections coming in and earlier infected being removed:
$$\frac{dI}{dt} = \beta S I - \gamma I$$
SIR Model: Fatalities
• Fatalities are a subset of $R$, and the change in $D$ is also proportional to $I$:
$$\frac{dD}{dt} = \eta I$$
• Constants $\beta$, $\gamma$, and $\eta$ determine the trajectory of the pandemic.
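The four equations can be integrated numerically to see the sharp rise and fall. The following is a minimal sketch using forward Euler steps; the function name `simulate_sir`, the initial infected fraction, and the parameter values are illustrative assumptions, not values from the slides.

```python
def simulate_sir(beta, gamma, eta, i0=1e-4, days=300, steps_per_day=10):
    """Integrate the SIR equations with forward Euler; returns daily (S, I, R, D)."""
    dt = 1.0 / steps_per_day
    s, i, r, d = 1.0 - i0, i0, 0.0, 0.0
    history = [(s, i, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_inf = beta * s * i * dt   # -dS/dt = beta * S * I
            removed = gamma * i * dt      # dR/dt = gamma * I
            deaths = eta * i * dt         # dD/dt = eta * I (D is a subset of R)
            s -= new_inf
            i += new_inf - removed
            r += removed
            d += deaths
        history.append((s, i, r, d))
    return history


# Hypothetical parameter values, chosen only to produce a visible wave.
history = simulate_sir(beta=0.3, gamma=0.1, eta=0.001)
peak_day = max(range(len(history)), key=lambda day: history[day][1])
print("peak day:", peak_day)
print("infected fraction at peak:", round(history[peak_day][1], 3))
```

Changing `beta`, `gamma`, or `eta` and rerunning shows how each constant shifts the timing and height of the peak.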
Herd Immunity
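• Peak of the pandemic is when $I$ is maximum, or:
$$\frac{dI}{dt} = \beta S I - \gamma I = 0$$
• At this time:
$$S = \frac{\gamma}{\beta} = \frac{1}{R_0}$$
where $R_0 = \beta / \gamma$.
• If $S < 1/R_0$, then $dI/dt < 0$ and there is no spread!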
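The peak condition can be checked numerically. The sketch below runs the SIR equations with small Euler steps and records $S$ at the step where $I$ is largest; the helper name `susceptible_at_peak` and the values of $\beta$ and $\gamma$ are assumptions chosen for illustration.

```python
def susceptible_at_peak(beta, gamma, i0=1e-5, dt=0.01, max_days=1000):
    """Run SIR by forward Euler and return S at the step where I is largest."""
    s, i = 1.0 - i0, i0
    best_i, s_at_peak = i, s
    for _ in range(int(max_days / dt)):
        new_inf = beta * s * i * dt
        s, i = s - new_inf, i + new_inf - gamma * i * dt
        if i > best_i:
            best_i, s_at_peak = i, s
    return s_at_peak


beta, gamma = 0.3, 0.1      # illustrative values, so R0 = beta / gamma = 3
r0 = beta / gamma
s_peak = susceptible_at_peak(beta, gamma)
print("S at the peak of I:", round(s_peak, 3))
print("1 / R0:", round(1 / r0, 3))
```

Up to the discretisation error of the Euler step, the two printed numbers should agree, which is the statement $S = \gamma / \beta$ at the peak.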
Estimating Parameter Values
Parameter Values from Data
• The SIR model has three parameters: $\beta$, $\gamma$, and $\eta$.
• How does one compute their values for a specific pandemic?
• Can it be done from the data about the pandemic?
• Yes, provided the data is accurate.
• Let us see how to estimate $\beta$.
Estimating β in SIR Model
• Let $N$ denote the fraction of new infections at time $t$.
• Then:
$$N = \beta S I = \beta (1 - I - R) I = \beta I - \beta (I + R) I$$
• Alternately:
$$I = \frac{1}{\beta} N + (I + R) I$$
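Estimating β in SIR Model
• Let $\hat{N}$, $\hat{I}$, $\hat{R}$ denote the respective actual numbers at time $t$ (so $\hat{I} = P_0 I$, and similarly for the others), where $P_0$ is the
population of the region.
• Then:
$$\hat{I} = \frac{1}{\beta} \hat{N} + \frac{1}{P_0} (\hat{I} + \hat{R}) \hat{I}$$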
Estimating β in SIR Model
• This demonstrates a linear relationship between $\hat{I}$, $\hat{N}$, and $(\hat{I} + \hat{R}) \hat{I}$.
• If these three quantities can be measured, then parameter $\beta$ can be
estimated.
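One way to turn the linear relationship into an estimate is a least-squares fit through the origin of $\hat{I} - (\hat{I} + \hat{R}) \hat{I} / P_0$ against $\hat{N}$, whose slope is $1/\beta$. The sketch below is an illustration on synthetic data from a discrete-time SIR run; the function name `estimate_beta`, the population, and the parameter values are assumptions, not part of the original slides.

```python
def estimate_beta(new_cases, infected, removed, population):
    """Least-squares slope through the origin for
    I_hat - (I_hat + R_hat) * I_hat / P0 = (1 / beta) * N_hat."""
    ys = [i - (i + r) * i / population for i, r in zip(infected, removed)]
    slope = sum(x * y for x, y in zip(new_cases, ys)) / sum(x * x for x in new_cases)
    return 1.0 / slope


# Synthetic daily counts from a discrete-time SIR run (hypothetical values).
P0, true_beta, gamma = 1_000_000, 0.25, 0.1
s, i, r = P0 - 100.0, 100.0, 0.0
new_cases, infected, removed = [], [], []
for _ in range(120):
    n = true_beta * s * i / P0          # new infections today
    new_cases.append(n)
    infected.append(i)
    removed.append(r)
    s, i, r = s - n, i + n - gamma * i, r + gamma * i

beta_hat = estimate_beta(new_cases, infected, removed, P0)
print("estimated beta:", round(beta_hat, 4))
```

On real data the three series are noisy and often incomplete, so the recovered value is only as good as the measurements, in line with the caveat that the data must be accurate.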
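[Figure: SUTRA model compartments. Susceptible ($S$) to Undetected ($U$) to Removed ($R_U$); Undetected ($U$) to Tested +ve ($T$) to Removed ($R_T$).]
• Dynamics, with $N_T$ the fraction newly moving from $U$ to $T$ per day:
$$\frac{dS}{dt} = -\beta S U, \qquad \frac{dU}{dt} = \beta S U - N_T - \gamma U, \qquad \frac{dR_U}{dt} = \gamma U$$
$$\frac{dT}{dt} = N_T - \gamma T, \qquad \frac{dR_T}{dt} = \gamma T$$
• Adding $U$ and $T$, and $R_U$ and $R_T$, reduces this to the SIR model.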
SUTRA Model: Transition from U to T
• Standard way is to choose $N_T$ proportional to $U$
• However, this is not a good choice since only the recently infected move
from $U$ to $T$
• Due to contact tracing protocol
• Also, analysis is hard
• Idea: $\beta S U$, the fraction newly infected, is a better approximation of the size of the recently infected population; a constant fraction $\epsilon$ of it is detected.
• Hence, set $N_T = \epsilon \beta S U$
• This also makes the analysis very neat!
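To see the effect of this choice, the compartments can be simulated directly. The sketch below assumes $N_T = \epsilon \beta S U$ and, as a labeled assumption, that the tested-positive compartment follows $dT/dt = N_T - \gamma T$ and $dR_T/dt = \gamma T$, which is what makes $U + T$ and $R_U + R_T$ add up to the SIR compartments. The function name `simulate_sutra` and all parameter values are hypothetical.

```python
def simulate_sutra(beta, gamma, eps, u0=1e-4, days=200, steps_per_day=20):
    """Forward-Euler integration of S, U, T, R_U, R_T with N_T = eps*beta*S*U."""
    dt = 1.0 / steps_per_day
    s, u, t, ru, rt = 1.0 - u0, u0, 0.0, 0.0, 0.0
    rows = []
    for day in range(days + 1):
        rows.append({"day": day, "S": s, "U": u, "T": t, "RU": ru, "RT": rt,
                     "NT": eps * beta * s * u})
        for _ in range(steps_per_day):
            infections = beta * s * u * dt       # leaves S, enters U or T
            detected = eps * infections          # N_T * dt, assumed to enter T
            s, u, t, ru, rt = (s - infections,
                               u + infections - detected - gamma * u * dt,
                               t + detected - gamma * t * dt,
                               ru + gamma * u * dt,
                               rt + gamma * t * dt)
    return rows


# Illustrative values only.
rows = simulate_sutra(beta=0.3, gamma=0.1, eps=0.05)
peak = max(rows, key=lambda row: row["NT"])
print("detected new infections peak on day", peak["day"])
```

Only `T`, `RT`, and `NT` would be observable in practice; `U` and `RU` are carried here to show what the model implies about undetected cases.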
SUTRA Model: Analysis
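• Compare equations for $U$ and $T$:
$$\frac{dU}{dt} + \gamma U = \beta (1 - \epsilon) S U, \qquad \frac{dT}{dt} + \gamma T = \epsilon \beta S U$$
• This gives:
$$\frac{d (T - \hat{\epsilon} U)}{dt} = -\gamma (T - \hat{\epsilon} U), \qquad \hat{\epsilon} = \epsilon / (1 - \epsilon)$$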
SUTRA Model: Analysis
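• Therefore, for a constant $a$:
$$T = \hat{\epsilon} U + a e^{-\gamma t}$$
• Thus $T$ quickly converges to $\hat{\epsilon} U$.
• We also get, for a constant $c$:
$$U + R_U = \frac{1}{\hat{\epsilon}} (T + R_T) + c$$
$$N_T = \epsilon \beta S U$$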
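Both statements can be checked on a simulated trajectory. The sketch below integrates the compartments with small Euler steps, assuming $dT/dt = \epsilon \beta S U - \gamma T$ and $dR_T/dt = \gamma T$, then compares $T - \hat{\epsilon} U$ with $a e^{-\gamma t}$ and measures how far $U + R_U - (T + R_T)/\hat{\epsilon}$ moves from its initial value $c$. The parameter values and the 30-day horizon are illustrative assumptions.

```python
import math

beta, gamma, eps = 0.3, 0.1, 0.05       # illustrative values only
eps_hat = eps / (1 - eps)
u0, dt, horizon_days = 1e-4, 0.001, 30
s, u, t, ru, rt = 1.0 - u0, u0, 0.0, 0.0, 0.0
a = t - eps_hat * u                      # value of T - eps_hat * U at time 0
c = (u + ru) - (t + rt) / eps_hat        # value of the invariant at time 0

for _ in range(int(horizon_days / dt)):
    infections = beta * s * u * dt
    s, u, t, ru, rt = (s - infections,
                       u + (1 - eps) * infections - gamma * u * dt,
                       t + eps * infections - gamma * t * dt,
                       ru + gamma * u * dt,
                       rt + gamma * t * dt)

gap = t - eps_hat * u
predicted_gap = a * math.exp(-gamma * horizon_days)
drift = (u + ru) - (t + rt) / eps_hat - c
print("T - eps_hat*U:", gap, "predicted:", predicted_gap)
print("T / U:", t / u, "eps_hat:", eps_hat)
print("change in U + R_U - (T + R_T)/eps_hat:", drift)
```

The gap shrinks by a factor $e^{-\gamma}$ per day regardless of how the epidemic itself evolves, which is why $T / U$ settles near $\hat{\epsilon}$ well before the peak.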
SUTRA Model: Analysis
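• Resulting in:
$$T = \frac{1}{\beta (1 - \epsilon)(1 - c)} N_T + \frac{1}{\epsilon (1 - c)} (T + R_T) T$$
that is, $T = b N_T + e (T + R_T) T$ with $b = \frac{1}{\beta (1 - \epsilon)(1 - c)}$ and $e = \frac{1}{\epsilon (1 - c)}$.
• Let $\hat{T}$, $\hat{R}_T$, $\hat{N}_T$ denote the actual numbers and $P_0$ the population of the region, as before.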
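The relation for $T$ follows by substitution. The sketch below assumes the transient term $a e^{-\gamma t}$ has died out, so that $T = \hat{\epsilon} U$, and uses the identity $1 + 1/\hat{\epsilon} = 1/\epsilon$; it introduces no symbols beyond those already defined.

```latex
\begin{align*}
S &= 1 - (U + R_U) - (T + R_T)
   = (1 - c) - \Bigl(1 + \tfrac{1}{\hat{\epsilon}}\Bigr)(T + R_T)
   = (1 - c) - \tfrac{1}{\epsilon}\,(T + R_T), \\
N_T &= \epsilon \beta S U
   = \epsilon \beta S \cdot \tfrac{1 - \epsilon}{\epsilon}\, T
   = \beta (1 - \epsilon)\, T \Bigl[(1 - c) - \tfrac{1}{\epsilon}\,(T + R_T)\Bigr], \\
T &= \frac{1}{\beta (1 - \epsilon)(1 - c)}\, N_T
   + \frac{1}{\epsilon (1 - c)}\,(T + R_T)\, T .
\end{align*}
```

The last line is the second line divided by $\beta (1 - \epsilon)(1 - c)$ and rearranged.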
SUTRA Model: Analysis
• This gives:
$$\hat{T} = b \hat{N}_T + \frac{e}{P_0} (\hat{T} + \hat{R}_T) \hat{T}$$
Fundamental sutra of the model
SUTRA Model: Parameters
$$\hat{T} = b \hat{N}_T + \frac{e}{P_0} (\hat{T} + \hat{R}_T) \hat{T}$$
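Because the equation is linear in $b$ and $e$, both can be estimated by ordinary least squares with two regressors and no intercept. The sketch below solves the 2x2 normal equations directly on synthetic series built to satisfy the equation exactly; the function name `fit_b_e`, the population, and the "true" values are hypothetical and are not the India estimates.

```python
def fit_b_e(t_hat, n_hat, r_hat, population):
    """Least-squares fit of T = b*N + (e/P0)*(T + R_T)*T with no intercept."""
    x1 = n_hat
    x2 = [(t + r) * t / population for t, r in zip(t_hat, r_hat)]
    s11 = sum(p * p for p in x1)
    s12 = sum(p * q for p, q in zip(x1, x2))
    s22 = sum(q * q for q in x2)
    y1 = sum(p * y for p, y in zip(x1, t_hat))
    y2 = sum(q * y for q, y in zip(x2, t_hat))
    det = s11 * s22 - s12 * s12          # determinant of the normal equations
    b = (y1 * s22 - y2 * s12) / det      # Cramer's rule
    e = (s11 * y2 - s12 * y1) / det
    return b, e


# Synthetic series (hypothetical): choose T and R_T, then back out N.
P0, true_b, true_e = 10_000_000, 5.0, 80.0
t_hat = [1000.0 + 50.0 * day for day in range(60)]
r_hat, running = [], 0.0
for t in t_hat:
    r_hat.append(running)
    running += 0.1 * t                   # removed accumulate over time
n_hat = [(t - true_e * (t + r) * t / P0) / true_b for t, r in zip(t_hat, r_hat)]

b_fit, e_fit = fit_b_e(t_hat, n_hat, r_hat, P0)
print("b:", round(b_fit, 4), "e:", round(e_fit, 3))
```

Once $b$ is fixed, plotting $\hat{T} - b \hat{N}_T$ against $\hat{T} (\hat{T} + \hat{R}_T)$ should give a straight line through the origin with slope $e / P_0$.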
[Figures: eight scatter plots of T – b*N (vertical axis) against T * (T + RT) (horizontal axis).]
Fitted values reported on the India Data plots:
b       e       R2
6.29    165     0.999
6.68    82      0.999
4.93    83.6    0.999
4.29    83.1    0.999
2.64    68.3    0.999
3.52    38.6    0.999
Observations
• The equation holds for ~62% of the days in the entire timeline
• This explains why the relationship does not hold for some days
Spread of Pandemic: A Relook
• Suppose the pandemic is spreading in China, but not in other
countries of Asia
• The population used for computing fractions would be that of China, not
of Asia
• Then why should one use the population of the whole of India when the
pandemic had just started?
• At that time, it was confined to pockets of metro cities
• Therefore, even the population changes with time!
Spread of Pandemic: A Relook
• Let $P(t)$ denote the population of the region within reach of the pandemic at
time $t$.
• Define $\rho(t) = P(t) / P_0$, a new parameter that increases over time from $0$ to $1$, where $P_0$ is
the total population of the region.
• Parameter $\rho$ is called the reach of the pandemic.
Spread of Pandemic: A Relook
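• The fundamental sutra changes to:
$$\hat{T} = b \hat{N}_T + \frac{e}{P_0} (\hat{T} + \hat{R}_T) \hat{T}$$
with $b = \frac{1}{\beta (1 - \epsilon)(1 - c)}$ and $e = \frac{1}{\rho \epsilon (1 - c)}$, where $\rho$ is the reach of the pandemic.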
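A sketch of why only the coefficient of the quadratic term changes. It rests on two assumptions that should be read as such, since the symbols did not survive in the text above: the reach is written $\rho$, and fractions are taken with respect to the population within reach, $\rho P_0$, rather than $P_0$.

```latex
\begin{align*}
T &= \frac{\hat{T}}{\rho P_0}, \qquad
R_T = \frac{\hat{R}_T}{\rho P_0}, \qquad
N_T = \frac{\hat{N}_T}{\rho P_0}, \\
\frac{\hat{T}}{\rho P_0}
  &= \frac{1}{\beta (1 - \epsilon)(1 - c)} \cdot \frac{\hat{N}_T}{\rho P_0}
   + \frac{1}{\epsilon (1 - c)} \cdot \frac{(\hat{T} + \hat{R}_T)\, \hat{T}}{\rho^2 P_0^2}, \\
\hat{T}
  &= \frac{1}{\beta (1 - \epsilon)(1 - c)}\, \hat{N}_T
   + \frac{1}{\rho\, \epsilon (1 - c)} \cdot \frac{(\hat{T} + \hat{R}_T)\, \hat{T}}{P_0} .
\end{align*}
```

Under these assumptions $b$ is unchanged and $e$ picks up a factor $1/\rho$, so a smaller reach would appear as a larger fitted $e$.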
[Figure: India: Pandemic Spread Captured by Model. Detected New Infections (7 day average); vertical axis: Infections; horizontal axis: Date, 4/1/2020 to 3/2/2022; legend: Actual Data.]