SUTRA - ML Case Study

Uploaded by

Abhira
The COVID SUTRA

Manindra Agrawal
IIT Kanpur
Existing Models for Pandemics
Modelling of Pandemics
• Pandemics such as plague, flu, cholera exhibit sharp rise and fall:

Spanish flu deaths in UK (Source: https://doi.org/10.3201/eid1201.050979, CC)


Modelling of Pandemics
• To explain this phenomenon, Kermack and McKendrick (1927) proposed a
mathematical model called the SIR model.

Susceptible → Infected → Removed

• Susceptible: population not yet infected


• Infected: population with infection
• Removed: population no longer infected (includes fatalities)
SIR Model
• Let $S(t)$, $I(t)$, and $R(t)$ represent the fraction of the population in each of the three groups at
time $t$:

$$S(t) + I(t) + R(t) = 1$$
SIR Model: Spread of Infection
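• Susceptible get infected in proportion to both $S$ and $I$:
• Suppose an infected person meets $k$ persons on average per day and transfers
infection to each with probability $p$.
• Then one infected person newly infects $\beta S$ persons in one day, where $\beta = kp$, since only a fraction $S$ of the persons met are susceptible.
• Therefore, the fraction of the population newly infected in one day is $\beta S I$, giving:

$$\frac{dS}{dt} = -\beta S I$$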
SIR Model: Removal of Infected
• Infected get removed in proportion to $I$:

$$\frac{dR}{dt} = \gamma I$$
SIR Model: Change in Infected
• $I$ changes with new infections coming in and earlier infected being removed:

$$\frac{dI}{dt} = \beta S I - \gamma I$$
SIR Model: Fatalities
• Fatalities $D$ are a subset of $R$, and the change in $D$ is also proportional to $I$:

$$\frac{dD}{dt} = \eta I$$

• Constants $\beta$, $\gamma$, and $\eta$ determine the trajectory of the pandemic.
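The four SIR equations can be integrated numerically to see the sharp rise and fall they produce. The following is a minimal sketch using forward-Euler steps; the function name `simulate_sir` and the parameter values (beta = 0.3, gamma = 0.1, eta = 0.001 per day, initial infected fraction 1e-4) are illustrative assumptions, not values from the slides.

```python
def simulate_sir(beta, gamma, eta, i0=1e-4, days=200, steps_per_day=10):
    """Forward-Euler integration of
        dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I,
        dR/dt = gamma*I,    dD/dt = eta*I.
    Returns one (S, I, R, D) tuple of population fractions per day."""
    dt = 1.0 / steps_per_day
    s, i, r, d = 1.0 - i0, i0, 0.0, 0.0
    daily = [(s, i, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt   # S -> I
            removed = gamma * i * dt             # I -> R
            d += eta * i * dt                    # fatalities, a subset of R
            s -= new_infections
            i += new_infections - removed
            r += removed
        daily.append((s, i, r, d))
    return daily


# Hypothetical parameter values, chosen only for illustration.
trajectory = simulate_sir(beta=0.3, gamma=0.1, eta=0.001)

# Day on which the infected fraction I is largest.
peak_day = max(range(len(trajectory)), key=lambda day: trajectory[day][1])
print("peak day:", peak_day, "S at peak:", round(trajectory[peak_day][0], 3))
```

Under these assumed values, the susceptible fraction at the peak should sit close to gamma/beta, which is the condition derived on the Herd Immunity slide.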
Herd Immunity
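• The peak of the pandemic is when $I$ is maximum, or:

$$\frac{dI}{dt} = \beta S I - \gamma I = 0$$

• At this time:

$$S = \frac{\gamma}{\beta} = \frac{1}{R_0}, \qquad R_0 = \frac{\beta}{\gamma}$$

If $S < 1/R_0$, then $dI/dt < 0$ and there is no spread!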
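As a worked illustration of the peak condition, assume hypothetical rates $\beta = 0.3$ and $\gamma = 0.1$ per day (these numbers are not from the slides), with $R_0 = \beta / \gamma$:

```latex
R_0 = \frac{\beta}{\gamma} = \frac{0.3}{0.1} = 3,
\qquad
S_{\text{peak}} = \frac{1}{R_0} = \frac{1}{3},
\qquad
1 - S_{\text{peak}} = \frac{2}{3}
```

Under these assumed values, infections would start to decline once about two thirds of the population is no longer susceptible.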
Estimating Parameter Values
Parameter Values from Data
• The SIR model has three parameters: $\beta$, $\gamma$, and $\eta$.
• How does one compute their values for a specific pandemic?
• Can it be done from the data about the pandemic?
• Yes, provided the data is accurate.
• Let us see how to estimate $\beta$.
Estimating $\beta$ in SIR Model
• Let $N(t)$ denote the fraction of the population newly infected at time $t$.
• Then:

$$N = \beta S I = \beta (1 - I - R) I = \beta I - \beta (I + R) I$$

• Alternately:

$$I = \frac{1}{\beta} N + (I + R) I$$
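Estimating $\beta$ in SIR Model
• Let $\hat{I}$, $\hat{R}$, and $\hat{N}$ denote the respective actual numbers at time $t$, where $P_0$ is the
population of the region.
• Then:

$$\hat{I} = \frac{1}{\beta} \hat{N} + \frac{1}{P_0} (\hat{I} + \hat{R}) \hat{I}$$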
Estimating $\beta$ in SIR Model
• This demonstrates a linear relationship between $\hat{I}$, $\hat{N}$, and $(\hat{I} + \hat{R}) \hat{I}$.
• If these three quantities can be measured, then parameter $\beta$ can be
estimated.

The problem is that reported values of infections may differ greatly from actual values.

That is why epidemiologists estimate parameter values using other methods like studying virus properties, population dynamics, and the status of healthcare infrastructure.
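The linear relationship for the SIR model can be checked on synthetic data where the true values are known. The sketch below generates daily counts from the SIR equations and then recovers $\beta$ and $P_0$ by an ordinary least-squares fit with no intercept. The helper names, the parameter values, and the use of least squares are assumptions made for illustration; with real reported data the inputs would be far noisier.

```python
def simulate_counts(beta, gamma, p0, i0=10.0, days=150):
    """Daily Euler steps of the SIR model, returned as actual numbers
    (I_hat, R_hat, N_hat) for a region with population p0."""
    s, i, r = 1.0 - i0 / p0, i0 / p0, 0.0
    rows = []
    for _ in range(days):
        n = beta * s * i                      # fraction newly infected today
        rows.append((i * p0, r * p0, n * p0))
        s, i, r = s - n, i + n - gamma * i, r + gamma * i
    return rows


def fit_two_coefficients(y, x1, x2):
    """Least-squares fit of y ~ a*x1 + b*x2 (no intercept),
    solved through the 2x2 normal equations."""
    s11 = sum(u * u for u in x1)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s22 = sum(v * v for v in x2)
    t1 = sum(u * w for u, w in zip(x1, y))
    t2 = sum(v * w for v, w in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (s11 * t2 - s12 * t1) / det


# Hypothetical ground truth used to generate the synthetic data.
rows = simulate_counts(beta=0.3, gamma=0.1, p0=1_000_000)

i_hat = [row[0] for row in rows]
n_hat = [row[2] for row in rows]
product = [(row[0] + row[1]) * row[0] for row in rows]   # (I_hat + R_hat) * I_hat

# I_hat = (1/beta) * N_hat + (1/P0) * (I_hat + R_hat) * I_hat
inv_beta, inv_p0 = fit_two_coefficients(i_hat, n_hat, product)
beta_est, p0_est = 1.0 / inv_beta, 1.0 / inv_p0
print("estimated beta:", beta_est, "estimated P0:", p0_est)
```

Because the synthetic data satisfies the relationship exactly, the fit returns the generating values; the difficulty noted above arises only when the reported counts differ from the actual ones.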
COVID-19 Pandemic
• Different from earlier pandemics:
Has a large number of asymptomatic cases

• Most of these asymptomatic cases are not detected, and continue passing infection to others.
• Nearly all cases with severe symptoms get detected.

Without detecting, how does one estimate asymptomatic cases?


COVID-19 Pandemic
• On the positive side, extensive data is available for the first time
about pandemic progression in different regions.

Can one create a model that allows using reported data to estimate parameter values?
The SUTRA Model
Authors: M Agrawal (IITK), M Kanitkar (MUHS), M Vidyasagar (IITH)
SUTRA Model
• Group $U$: infected but undetected population
• It will mostly consist of asymptomatic cases
• Group $T$: infected and tested positive population
• Most of the symptomatic cases will be in $T$

Susceptible → Undetected → Removed
Undetected → Tested +ve → Removed

The A at the end stands for Approach

SUTRA Model
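• Let $S$, $U$, $T$, $R_U$, and $R_T$ represent the fraction of the population in each of the five groups at
time $t$:

$$S(t) + U(t) + T(t) + R_U(t) + R_T(t) = 1$$

• Dynamics, where $N_T$ is the fraction of the population moving from $U$ to $T$ per unit time:

$$\frac{dS}{dt} = -\beta S U, \qquad \frac{dU}{dt} = \beta S U - N_T - \gamma U, \qquad \frac{dR_U}{dt} = \gamma U$$

$$\frac{dT}{dt} = N_T - \gamma T, \qquad \frac{dR_T}{dt} = \gamma T$$

Adding $U$ and $T$, and $R_U$ and $R_T$, reduces this to the SIR model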
SUTRA Model: Transition from U to T
• Standard way is to choose $N_T$ proportional to $U$
• However, this is not a good choice since only the recently infected move
from $U$ to $T$
• Due to contact tracing protocol
• Also, analysis is hard
• Idea: $\beta S U$ is a better approximation of the size of the recently infected population
• Hence, set $N_T = \epsilon \beta S U$ for a constant $\epsilon$
• This also makes the analysis very neat!
SUTRA Model: Analysis
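• Compare equations for $U$ and $T$:

$$\frac{dU}{dt} + \gamma U = \beta (1 - \epsilon) S U, \qquad \frac{dT}{dt} + \gamma T = \beta \epsilon S U$$

• This gives:

$$\frac{d (T - \hat{\epsilon} U)}{dt} = -\gamma (T - \hat{\epsilon} U), \qquad \hat{\epsilon} = \epsilon / (1 - \epsilon)$$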
SUTRA Model: Analysis
• Therefore, for a constant $a$:

$$T = \hat{\epsilon} U + a e^{-\gamma t}$$

• Thus $T$ quickly converges to $\hat{\epsilon} U$.
• We also get, for a constant $c$:

$$U + R_U = \frac{1}{\hat{\epsilon}} (T + R_T) + c$$

• Therefore, $R_T$ quickly converges to $\hat{\epsilon} (R_U - c)$.
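Both convergence claims can be checked numerically. The sketch below integrates the five SUTRA compartments with forward-Euler steps, assuming $N_T = \epsilon \beta S U$ and $dT/dt = N_T - \gamma T$, and then compares $T/U$ with $\hat{\epsilon} = \epsilon / (1 - \epsilon)$ and evaluates the constant $c = U + R_U - (T + R_T)/\hat{\epsilon}$. The function name and all parameter values (beta = 0.3, gamma = 0.1, eps = 1/32, initial undetected fraction 1e-5) are illustrative assumptions.

```python
def simulate_sutra(beta, gamma, eps, u0=1e-5, days=120, steps_per_day=10):
    """Forward-Euler integration of the SUTRA compartments.
    Returns one (S, U, T, R_U, R_T) tuple of population fractions per day."""
    dt = 1.0 / steps_per_day
    s, u, t, r_u, r_t = 1.0 - u0, u0, 0.0, 0.0, 0.0
    daily = [(s, u, t, r_u, r_t)]
    for _ in range(days):
        for _ in range(steps_per_day):
            infections = beta * s * u * dt   # S -> infected
            detected = eps * infections      # N_T: share that is tested positive
            removed_u = gamma * u * dt       # U -> R_U
            removed_t = gamma * t * dt       # T -> R_T
            s -= infections
            u += infections - detected - removed_u
            t += detected - removed_t
            r_u += removed_u
            r_t += removed_t
        daily.append((s, u, t, r_u, r_t))
    return daily


# Hypothetical parameter values, chosen only for illustration.
beta, gamma, eps, u0 = 0.3, 0.1, 1.0 / 32.0, 1e-5
daily = simulate_sutra(beta, gamma, eps, u0=u0)

eps_hat = eps / (1.0 - eps)
s, u, t, r_u, r_t = daily[-1]
ratio_gap = abs(t / u - eps_hat)            # distance of T/U from eps_hat
c_value = (u + r_u) - (t + r_t) / eps_hat   # the constant c
print("T/U - eps_hat:", ratio_gap, "c:", c_value)
```

With these starting conditions (everything except $S$ and $U$ initially zero), $c$ equals the initial undetected fraction, so it is small compared with 1.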


SUTRA Model: Analysis
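• We have:

$$N_T = \epsilon \beta S U$$

• Substituting $U \approx T / \hat{\epsilon}$ and $S = 1 - (U + R_U) - (T + R_T) = (1 - c) - \frac{1}{\epsilon} (T + R_T)$:

$$N_T = \beta (1 - \epsilon) \left[ (1 - c) - \frac{1}{\epsilon} (T + R_T) \right] T$$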
SUTRA Model: Analysis
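• Resulting in:

$$T = \frac{1}{\beta (1 - \epsilon)(1 - c)} N_T + \frac{1}{\epsilon (1 - c)} (T + R_T) T$$

• That is, $T = b N_T + e (T + R_T) T$ with $b = \frac{1}{\beta (1 - \epsilon)(1 - c)}$ and $e = \frac{1}{\epsilon (1 - c)}$.

• Let $\hat{T}$, $\hat{R}_T$, and $\hat{N}_T$ denote the respective actual numbers, as before.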
SUTRA Model: Analysis
• This gives:

$$\hat{T} = b \hat{N}_T + \frac{e}{P_0} (\hat{T} + \hat{R}_T) \hat{T}$$
Fundamental sutra of the model
SUTRA Model: Parameters

• $\beta$: Contact rate, governs the speed at which people get infected
• $\gamma$: Removal rate, governs the speed at which infected people get removed
• $\eta$: Mortality rate
• $\epsilon$: Ratio of detected to total infections
• $c$: Constant connecting $U + R_U$ to $T + R_T$
Estimation of Parameters
• From the fundamental sutra:

$$\hat{T} = b \hat{N}_T + \frac{e}{P_0} (\hat{T} + \hat{R}_T) \hat{T}$$

values of $b$ and $e$ can be calculated.
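One way to calculate $b$ and $e$ is a least-squares fit of $\hat{T}$ against $\hat{N}_T$ and $(\hat{T} + \hat{R}_T) \hat{T}$. The sketch below does this on synthetic counts generated from the SUTRA dynamics with $N_T = \epsilon \beta S U$, so the recovered values can be compared with $1/(\beta (1 - \epsilon)(1 - c))$ and $1/(\epsilon (1 - c))$. The joint least-squares fit, the helper names, and every parameter value are assumptions for illustration; the slides do not state which fitting procedure the authors use.

```python
def simulate_sutra_counts(beta, gamma, eps, p0, u0=10.0, days=150):
    """Daily Euler steps of the SUTRA dynamics with N_T = eps*beta*S*U.
    Returns per-day actual numbers (T_hat, R_T_hat, N_T_hat)."""
    s, u, t, r_u, r_t = 1.0 - u0 / p0, u0 / p0, 0.0, 0.0, 0.0
    rows = []
    for _ in range(days):
        infections = beta * s * u
        n_t = eps * infections               # newly detected today
        rows.append((t * p0, r_t * p0, n_t * p0))
        s, u, t, r_u, r_t = (
            s - infections,
            u + infections - n_t - gamma * u,
            t + n_t - gamma * t,
            r_u + gamma * u,
            r_t + gamma * t,
        )
    return rows


def fit_b_and_e(rows, p0):
    """Least-squares fit of T_hat = b*N_T_hat + (e/P0)*(T_hat + R_T_hat)*T_hat
    with no intercept, solved through the 2x2 normal equations."""
    y = [t for t, _, _ in rows]
    x1 = [n for _, _, n in rows]
    x2 = [(t + r) * t for t, r, _ in rows]
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    t1 = sum(a * w for a, w in zip(x1, y))
    t2 = sum(b * w for b, w in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b_coef = (t1 * s22 - t2 * s12) / det
    e_over_p0 = (s11 * t2 - s12 * t1) / det
    return b_coef, e_over_p0 * p0


# Hypothetical ground truth; the first 30 days are skipped so that
# T has had time to converge to eps_hat * U.
p0 = 1_000_000
rows = simulate_sutra_counts(beta=0.3, gamma=0.1, eps=1.0 / 32.0, p0=p0)[30:]
b_est, e_est = fit_b_and_e(rows, p0)
print("b:", round(b_est, 3), "e:", round(e_est, 3))
```

The early days are dropped because the relationship only holds once the transient term $a e^{-\gamma t}$ has decayed.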


Connection with Reality?
• Does the model capture actual trajectory of Covid-19?
• Fortunately, it can be tested easily.
• Equation

$$\hat{T} = b \hat{N}_T + \frac{e}{P_0} (\hat{T} + \hat{R}_T) \hat{T}$$

implies a linear relationship over time between three known quantities.

To verify, we plot $\hat{T} - b \hat{N}_T$ against $(\hat{T} + \hat{R}_T) \hat{T}$ for suitably chosen $b$.


[Figures: India Data, eight plots of T − b·N (vertical axis) against T·(T + RT) (horizontal axis), one per period below. India data is taken from www.covid19india.org]

Period                             b      e       R²
March 23 – April 23, 2020          3.86   39164   0.998
April 29 – June 20, 2020           6.38   917     0.999
July 21 – August 21, 2020          6.29   165     0.999
September 21 – November 1, 2020    6.68   82      0.999
November 12 – December 31, 2020    4.93   83.6    0.999
January 22 – January 31, 2021      4.29   83.1    0.999
March 22 – March 28, 2021          2.64   68.3    0.999
April 24 – May 17, 2021            3.52   38.6    0.999
Observations
The equation holds for ~62% of days in the entire timeline

• There are eleven different phases with different values of $b$ and $e$

Values of $e$ for India:
Phase   1         2       3     4     5      6      7      8      9      10     11
e       5759253   39164   917   165   81.9   83.6   83.1   68.3   38.7   37.8   33.1

Simulations of 26 countries, 35 states and UTs, and 500+ districts of India show the same phenomenon!
Questions

Why do the values of $b$ and $e$ change?

Why does the relationship not hold for some days?

What is the meaning of the rapidly decreasing value of $e$?


Phase Changes
• Lockdowns, personal protection measures reduce $\beta$
• Crowding and mutants increase $\beta$
• Testing policies change $\epsilon$
This can explain changes in $b$, but not dramatic changes in $e$

• It is reasonable to assume that parameter values drift for some time after a phase change and then stabilize.

This explains why the relationship does not hold for some days
Spread of Pandemic: A Relook
• Suppose the pandemic is spreading in China, but not in other
countries of Asia
• The population used for computing fractions would be of China, not
of Asia
• Then why should one use population of whole India when the
pandemic just started?
• At the time, it was confined to pockets of metro cities
• Therefore, even the population changes with time!
Spread of Pandemic: A Relook
• Let $P(t)$ denote the population of the region within reach of the pandemic at
time $t$.
• Define $\rho = P / P_0$, a new parameter that increases over time from $0$ to $1$, where $P_0$ is
the total population of the region.
• Parameter $\rho$ is called the reach of the pandemic.
Spread of Pandemic: A Relook
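• The fundamental sutra changes to:

$$\hat{T} = b \hat{N}_T + \frac{e}{P_0} (\hat{T} + \hat{R}_T) \hat{T}$$

with $b = \frac{1}{\beta (1 - \epsilon)(1 - c)}$ and $e = \frac{1}{\rho \epsilon (1 - c)}$.

The value of $e$ is large at the beginning due to the small value of $\rho$, and it reduces as $\rho$ increases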
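To see how a falling $e$ translates into a growing reach, one can invert $e = 1 / (\rho \epsilon (1 - c))$, which is the form obtained by replacing $P_0$ with $\rho P_0$ in the fundamental sutra. The sketch below is a rough back-of-the-envelope calculation only: it assumes $1/\epsilon = 32$ throughout and $c = 0$, and the function name `reach_from_e` is hypothetical. It is not the authors' estimation procedure, so its output need not match the parameter values they report.

```python
def reach_from_e(e, inv_eps=32.0, c=0.0):
    """Reach rho implied by e = 1 / (rho * eps * (1 - c)),
    under the simplifying assumptions 1/eps = inv_eps and a known c."""
    return inv_eps / (e * (1.0 - c))


# Values of e reported for India, Phases 2 to 11.
e_values = [39164, 917, 165, 81.9, 83.6, 83.1, 68.3, 38.7, 37.8, 33.1]
reach_percent = [100.0 * reach_from_e(e) for e in e_values]

for phase, rho in enumerate(reach_percent, start=2):
    print(f"Phase {phase}: rho is roughly {rho:.1f}%")
```

Under these assumptions, the drop in $e$ from tens of thousands to the low thirties corresponds to the reach growing from a fraction of a percent to most of the population.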
Estimating All parameters
• How does one estimate $\beta$ and $\rho$ from $b$ and $e$?
• Define function as:
• On input , set and , compute and , and use it to compute trajectory of and
for current phase. Compare with and to estimate . Output .
• Value is a fix-point of .

Experimentally, it is found that the function has a unique fixed point that can be found quickly by iterating a small number of times from a random point.
India: Pandemic Spread

[Figure: Detected New Infections (7 day average) against Date, 4/1/2020 to 3/2/2022; vertical axis Infections, 0 to 450,000. Series: Actual Data]
India: Pandemic Spread Captured by Model
[Figure: Detected New Infections (7 day average) against Date, 4/1/2020 to 3/2/2022; vertical axis Infections, 0 to 450,000. Series: Actual Data, Model Computed Data]

Calibrated with latest ICMR serosurvey

India: Parameter Values


We fix

Phase      Start Date   Drift Period   β            1/ϵ        ρ (in %)
Phase 1    03-03-2020   4              0.3 ±0.04    32         0 ±0
Phase 2    19-03-2020   5              0.32 ±0.01   32 ±0      0 ±0
Phase 3    16-04-2020   5              0.16 ±0      32 ±0      3.6 ±0.3
Phase 4    21-06-2020   25             0.16 ±0      32 ±0      20.8 ±1.9
Phase 5    22-08-2020   12             0.16 ±0      32 ±0      37.8 ±1.6
Phase 6    29-10-2020   10             0.19 ±0      32 ±0      42.3 ±0.8
Phase 7    18-12-2020   20             0.19 ±0      32 ±0      44.8 ±0.5
Phase 8    11-02-2021   35             0.4 ±0.01    32 ±0      46.2 ±0.8
Phase 9    30-03-2021   25             0.29 ±0      32 ±0      82.8 ±0.9
Phase 10   25-05-2021   0              0.28 ±0      32 ±0      85.9 ±2.6
Phase 11   20-06-2021   38             0.52 ±0      32 ±0      92 ±0.1
Phase 12   20-08-2021   2              0.6 ±0.01    32 ±0      92 ±0.4
Phase 13   01-11-2021   31             0.73 ±0.01   32 ±0      92.7 ±0.1
Phase 14   25-12-2021   11             1.54 ±0.17   32 ±0      104.5 ±2.9
Phase 15   10-01-2022   7              1.22 ±0.01   32.1 ±0    103.5 ±1
Loss and Gain of Immunity
Handling Loss of Immunity and Vaccination
• It can be proved that:

If fraction of population loses immunity and fraction gains immunity


due to vaccination, the new dynamics is captured in the model by
multiplying both and by .

• This is the reason why $\rho$ has gone above 1 (100%) currently


Impact of Omicron

$\beta$ went up from 0.6 to 1.54, roughly a 2.5x increase

$\rho$ went up from 0.92 to 1.05, an increase of 0.13

It is likely that the entire increase in $\rho$ is due to immunity loss


Remarks
Strengths and Weaknesses of the Model
• Strengths:
• Probably the first model that can estimate values of all parameters only from
daily reported infections and deaths data
• Can provide an excellent understanding of the past
• Can provide future projections up to medium term, assuming that parameters
do not change significantly
• Can incorporate effects of immunity-loss and vaccinations by changing
parameters $\beta$ and $\rho$
• Can provide what-if analysis through setting parameters to different values
• Weaknesses:
• During drift period, estimating parameter values is difficult and so predictions
are likely to be wrong
• Cannot predict future values of parameters
Asking simple questions and pursuing their answers often leads to
interesting discoveries!
Thank You
