
Course Project 2: Impact of Severe Weather Events On Health and The Economy in The US

The document analyzes severe weather events in the US using NOAA storm data from 1995-2011. It finds that: 1) Floods have the most adverse economic impact, while tornadoes have the most adverse health impact, causing over 23,000 injuries or fatalities. 2) The top 10 events causing health impacts are excessive heat, flash floods, floods, heat, high winds, hurricanes, lightning, thunderstorm winds, tornadoes, and winter storms. 3) The top events causing economic costs include floods, hurricanes, storm surges, tornadoes, hail, flash floods, and drought, responsible for billions in property and crop damages.

Uploaded by

harshit tygai

Course Project 2

17/05/2020

Impact of Severe Weather Events on health and the economy in the US

Synopsis

This study investigates severe weather events in the US using the Storm Database of the U.S. National
Oceanic and Atmospheric Administration (NOAA). The database records severe weather events together with
the injuries, fatalities, and crop and property damage they caused. The analysis uses that information
to show that floods have the most adverse economic impact, while tornadoes have the most adverse
health impact.

Data Processing

Load requisite libraries

library(lubridate)
library(dplyr)
library(ggplot2)
library(RColorBrewer)
library(R.utils)

Read the data after downloading the bzip2-compressed file into the current working directory and decompressing it

if (!file.exists("./StormData.csv")) bunzip2("./StormData.csv.bz2")
if (!("stormdata" %in% ls())) stormdata <- read.table("StormData.csv",
stringsAsFactors = FALSE, sep = ",", header = TRUE)

Inspect the structure of the data. There are 902,297 observations of 37 variables. The ones used for the
analysis of economic damage and adverse health effects are BGN_DATE (date), EVTYPE (event type),
FATALITIES (number of fatalities), INJURIES (number of injuries), CROPDMG and PROPDMG (monetary
damage to crops and property), and CROPDMGEXP and PROPDMGEXP (the associated exponents of CROPDMG and PROPDMG).

str(stormdata)

## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : chr "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ..
## $ BGN_TIME : chr "0130" "0145" "1600" "0900" ...
## $ TIME_ZONE : chr "CST" "CST" "CST" "CST" ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: chr "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
## $ STATE : chr "AL" "AL" "AL" "AL" ...
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : chr "" "" "" "" ...
## $ BGN_LOCATI: chr "" "" "" "" ...
## $ END_DATE : chr "" "" "" "" ...
## $ END_TIME : chr "" "" "" "" ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : chr "" "" "" "" ...
## $ END_LOCATI: chr "" "" "" "" ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## $ WFO : chr "" "" "" "" ...
## $ STATEOFFIC: chr "" "" "" "" ...
## $ ZONENAMES : chr "" "" "" "" ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : chr "" "" "" "" ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...

Add a year column to the data

if (ncol(stormdata) == 37) stormdata$year <- year(mdy_hms(stormdata$BGN_DATE))
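
The year extraction above relies on lubridate. As a minimal sketch, the same step can be reproduced in base R on a few made-up dates in the BGN_DATE format; the vector bgn_date and the name event_year are hypothetical and not part of the original analysis.

```r
# Hypothetical sample values in the same "m/d/Y H:M:S" format as BGN_DATE
bgn_date <- c("4/18/1950 0:00:00", "2/20/1951 0:00:00", "6/8/1995 0:00:00")

# Parse only the date part; the trailing time text is ignored by the format
parsed <- as.Date(bgn_date, format = "%m/%d/%Y")

# Extract the four-digit year as an integer
event_year <- as.integer(format(parsed, "%Y"))
event_year
```

Either approach should give one integer year per observation, which is all the later filtering step needs.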

Treat year as a factor and count the observations per year to decide which years to use

summary(as.factor(stormdata$year))

## 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962
## 223 269 272 492 609 1413 1703 2184 2213 1813 1945 2246 2389
## 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975
## 1968 2348 2855 2388 2688 3312 2926 3215 3471 2168 4463 5386 4975
## 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988
## 3768 3728 3657 4279 6146 4517 7132 8322 7335 7979 8726 7367 7257
## 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001
## 10410 10946 12522 13534 12607 20631 27970 32270 28680 38128 31289 34471 34962
## 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
## 36293 39752 39363 39184 44034 43289 55663 45817 48161 62174

From 1995 onwards, the number of observations is consistently above 25000, so the analysis is limited to
the years 1995 to 2011.

stormdata <- filter(stormdata, year >= 1995)
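
The cutoff year was read off the table by eye. A minimal sketch of the same rule, using a few of the yearly counts printed above, might look like the following; the names counts, threshold and cutoff_year are hypothetical.

```r
# Yearly observation counts copied from the summary output above (subset)
counts <- c("1992" = 13534, "1993" = 12607, "1994" = 20631,
            "1995" = 27970, "1996" = 32270, "1997" = 28680)
threshold <- 25000

# TRUE for a year only if it and every later year exceed the threshold
above_from_here <- rev(cumprod(rev(counts > threshold))) == 1

# First year from which the counts stay above the threshold
cutoff_year <- as.integer(names(counts)[which(above_from_here)[1]])
cutoff_year
```

Working backwards with cumprod means a single low year late in the series would push the cutoff later, which matches the "consistently above" wording.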

Processing data for analysis of Adverse Effects on Health

Event type, fatalities, and injuries are selected from the filtered data (1995 - 2011). To analyse the
impact on health, fatalities and injuries are added together to form a net_impact variable, and events
with no impact are then filtered out.

health <- select(stormdata, EVTYPE, FATALITIES, INJURIES)
health$net_impact <- health$FATALITIES + health$INJURIES
health <- filter(health, net_impact > 0)
health <- health[, -c(2, 3)]
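
To make the net_impact step concrete, here is a small sketch on invented rows, written in base R rather than dplyr so that it runs without extra packages. The data frame toy and its values are hypothetical.

```r
# Hypothetical events: two with casualties, two without
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HAIL", "FLOOD", "TORNADO"),
  FATALITIES = c(1, 0, 0, 0),
  INJURIES   = c(14, 0, 2, 0),
  stringsAsFactors = FALSE
)

# Net health impact is fatalities plus injuries
toy$net_impact <- toy$FATALITIES + toy$INJURIES

# Keep only events with some impact, and only the columns needed later
toy_health <- toy[toy$net_impact > 0, c("EVTYPE", "net_impact")]
toy_health
```

Note that adding the two counts weights a fatality and an injury equally, which is the simplification the analysis adopts.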

A health_impact data frame is created that holds the total net_impact of each event type across all years.

health_impact <- with(health, aggregate(net_impact ~ EVTYPE, FUN = sum))

The 10 event types with the highest net_impact values are extracted

subset(health_impact, health_impact$net_impact > quantile(health_impact$net_impact,
    prob = 1 - 10/nlevels(as.factor(health_impact$EVTYPE))))

## EVTYPE net_impact
## 29 EXCESSIVE HEAT 8428
## 39 FLASH FLOOD 2668
## 42 FLOOD 7192
## 59 HEAT 2954
## 88 HURRICANE/TYPHOON 1339
## 100 LIGHTNING 5360
## 143 THUNDERSTORM WIND 1557
## 151 TORNADO 23310
## 156 TSTM WIND 3871
## 178 WINTER STORM 1493
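
The quantile filter returned ten rows here, but it can return fewer when several event types are tied at the cutoff value. As an alternative sketch (not the method used in this report), sorting and taking the first n rows always yields exactly n; the data frame impact and its values are hypothetical.

```r
# Hypothetical aggregated totals per event type
impact <- data.frame(
  EVTYPE     = c("A", "B", "C", "D", "E"),
  net_impact = c(5, 50, 20, 1, 30),
  stringsAsFactors = FALSE
)

n <- 3

# Sort by descending impact and keep the first n rows
top_n <- head(impact[order(impact$net_impact, decreasing = TRUE), ], n)
top_n
```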

According to the Storm Data Documentation, HURRICANE/TYPHOON and TSTM WIND are not the official event
names. In particular, TSTM WIND and THUNDERSTORM WIND are counted as separate events above, which may
have distorted the ranking of the highest net_impact values, so we recompute after renaming those rows.

health$EVTYPE[health$EVTYPE == "TSTM WIND"] <- "THUNDERSTORM WIND"
health$EVTYPE[health$EVTYPE == "HURRICANE/TYPHOON"] <- "HURRICANE (TYPHOON)"
health_impact <- with(health, aggregate(net_impact ~ EVTYPE,
    FUN = sum))
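
The effect of the renaming can be checked with the totals printed in the table above: THUNDERSTORM WIND (1557) and TSTM WIND (3871) combine to 5428, which moves the merged event ahead of LIGHTNING (5360). The sketch below does this with a named lookup vector instead of one assignment per name; health_toy and rename_map are hypothetical names.

```r
# Totals taken from the top-10 table above
health_toy <- data.frame(
  EVTYPE     = c("THUNDERSTORM WIND", "TSTM WIND", "LIGHTNING"),
  net_impact = c(1557, 3871, 5360),
  stringsAsFactors = FALSE
)

# Old name -> corrected name
rename_map <- c("TSTM WIND"         = "THUNDERSTORM WIND",
                "HURRICANE/TYPHOON" = "HURRICANE (TYPHOON)")

# Rewrite only the rows whose name appears in the lookup
needs_fix <- health_toy$EVTYPE %in% names(rename_map)
health_toy$EVTYPE[needs_fix] <- unname(rename_map[health_toy$EVTYPE[needs_fix]])

# Re-aggregate so the two thunderstorm rows collapse into one
merged <- aggregate(net_impact ~ EVTYPE, data = health_toy, FUN = sum)
merged
```

A lookup vector keeps all corrections in one place, which helps if more naming variants turn up.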

We extract the 10 rows with the highest net_impact values and assign them to worst_impact.

worst_impact <- subset(health_impact, health_impact$net_impact >
    quantile(health_impact$net_impact, prob = 1 - 10/nlevels(as.factor(health$EVTYPE))))

Processing data for analysis of economic consequences

The variables needed to analyse economic costs are assigned to the economic data frame, which is then
filtered to keep only those events that caused some property or crop damage.

economic <- select(stormdata, EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG,
    CROPDMGEXP)
economic <- filter(economic, PROPDMG > 0 | CROPDMG > 0)

The exponents for crop and property damage are converted from character codes (such as “B” for billion)
to numeric multipliers.

economic$CROPDMGEXP[tolower(economic$CROPDMGEXP) == "b"] <- 10^9
economic$CROPDMGEXP[tolower(economic$CROPDMGEXP) == "m"] <- 10^6
economic$CROPDMGEXP[tolower(economic$CROPDMGEXP) == "k"] <- 10^3
economic$CROPDMGEXP[tolower(economic$CROPDMGEXP) == "0"] <- 10^0
economic$CROPDMGEXP[tolower(economic$CROPDMGEXP) == "?"] <- 10^0
economic$CROPDMGEXP[tolower(economic$CROPDMGEXP) == ""] <- 10^0

economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "b"] <- 10^9
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "7"] <- 10^7
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "m"] <- 10^6
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "6"] <- 10^6
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "5"] <- 10^5
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "4"] <- 10^4
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "k"] <- 10^3
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "3"] <- 10^3
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "h"] <- 10^2
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "2"] <- 10^2
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "0"] <- 10^0
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "-"] <- 10^0
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == "+"] <- 10^0
economic$PROPDMGEXP[tolower(economic$PROPDMGEXP) == ""] <- 10^0

economic$PROPDMGEXP <- as.numeric(economic$PROPDMGEXP)
economic$CROPDMGEXP <- as.numeric(economic$CROPDMGEXP)
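
The per-code assignments above can also be expressed as a single lookup. The following is a minimal sketch under two assumptions that go beyond the original code: every single digit is read as a power of ten (the original lists only some digits), and any unrecognised code falls back to a multiplier of 1. The names multiplier_map and to_multiplier are hypothetical.

```r
# Letter codes and their multipliers (hundred, thousand, million, billion)
multiplier_map <- c(h = 1e2, k = 1e3, m = 1e6, b = 1e9)

to_multiplier <- function(codes) {
  codes <- tolower(codes)

  # Default multiplier of 1 covers "", "?", "+", "-" and anything unknown
  out <- rep(1, length(codes))

  # Letter codes come from the lookup vector
  is_letter <- codes %in% names(multiplier_map)
  out[is_letter] <- unname(multiplier_map[codes[is_letter]])

  # Assumption: a single digit d means 10^d
  is_digit <- grepl("^[0-9]$", codes)
  out[is_digit] <- 10^as.numeric(codes[is_digit])

  out
}

to_multiplier(c("K", "m", "B", "", "5", "?"))
```

Returning numeric values directly avoids the intermediate step in which numbers are stored as strings and converted back with as.numeric.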

A net_cost variable is created which is the sum of damage to property and crop.

economic <- within(economic, net_cost <- PROPDMG * PROPDMGEXP +
    CROPDMG * CROPDMGEXP)
economic <- economic[, -(2:5)]
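
As a worked example of the net_cost formula, the sketch below applies it to three invented events whose multipliers have already been converted to numbers. All vectors are hypothetical.

```r
# Hypothetical damage figures and their numeric multipliers
prop_dmg  <- c(25, 2.5, 1.2)
prop_mult <- c(1e3, 1e3, 1e9)
crop_dmg  <- c(0, 10, 0)
crop_mult <- c(1, 1e3, 1)

# Net cost in dollars: property damage plus crop damage
net_cost <- prop_dmg * prop_mult + crop_dmg * crop_mult
net_cost
```

The first event is 25 thousand dollars of property damage only, the second adds 10 thousand of crop damage to 2.5 thousand of property damage, and the third is 1.2 billion of property damage.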

The event-naming problem found during the health processing is corrected here as well

economic$EVTYPE[economic$EVTYPE == "TSTM WIND"] <- "THUNDERSTORM WIND"
economic$EVTYPE[economic$EVTYPE == "HURRICANE/TYPHOON"] <- "Hurricane (Typhoon)"

An economic consequence variable is formed that sums up the net economic cost over the years due to various
events.

economic_conseq <- with(economic, aggregate(net_cost ~ EVTYPE,
FUN = sum))

The rows containing the top 10 values of net_cost are extracted and assigned to worst_economic_conseq.

worst_economic_conseq <- subset(economic_conseq, economic_conseq$net_cost >
    quantile(economic_conseq$net_cost, prob = 1 - 10/nlevels(as.factor(economic_conseq$EVTYPE))))

Results

Adverse Health Impact

Colours are prepared for bar graph

mycolours <- colorRampPalette(brewer.pal(9, "Reds"))(nrow(worst_impact) +
    4)[5:(4 + nrow(worst_impact))]
names(mycolours) <- levels(with(worst_impact, reorder(EVTYPE,
    net_impact)))
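
The indexing in the palette code asks for four more shades than there are bars and then discards the first four, so the palest colours are never used. A minimal sketch of that idea follows, with a plain white-to-red ramp standing in for brewer.pal(9, "Reds") so that it runs without RColorBrewer; the names n, ramp and shades are hypothetical.

```r
n <- 10  # number of bars to colour

# Stand-in ramp (assumption): white to red instead of the Brewer "Reds" palette
ramp <- colorRampPalette(c("white", "red"))

# Generate n + 4 shades, then drop the 4 lightest ones
shades <- ramp(n + 4)[5:(n + 4)]
shades
```

Dropping the lightest shades keeps the smallest bars visible against the white background of theme_classic().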

A bar chart of net health impact against event type is plotted. It shows that tornadoes have by far the
most adverse effect on health, followed by excessive heat and floods.

ggplot(worst_impact, aes(x = reorder(EVTYPE, -net_impact), y = net_impact)) +
    geom_bar(stat = "identity", aes(fill = EVTYPE, color = EVTYPE)) +
    theme_classic() +
    theme(axis.text.x = element_text(angle = 90), legend.position = "none") +
    scale_fill_manual(values = mycolours) +
    scale_color_manual(values = mycolours) +
    ylab("Total Number of Fatalities and Injuries") +
    xlab("Event") +
    ggtitle("Fatalities and Injuries Caused by Severe Weather Events")

[Figure: bar chart "Fatalities and Injuries Caused by Severe Weather Events". x-axis: Event (EXCESSIVE HEAT, FLASH FLOOD, FLOOD, HEAT, HIGH WIND, HURRICANE (TYPHOON), LIGHTNING, THUNDERSTORM WIND, TORNADO, WINTER STORM); y-axis: Total Number of Fatalities and Injuries, with ticks at 5000, 10000, 15000 and 20000.]

Impact on the Economy

Colours are prepared for bar graph

econ_colours <- colorRampPalette(brewer.pal(9, "YlOrRd"))(nrow(worst_economic_conseq))
names(econ_colours) <- levels(with(worst_economic_conseq, reorder(EVTYPE,
    net_cost)))

A bar chart of net economic cost against event type is plotted. It shows that floods cause by far the
most economic damage, followed by Hurricane (Typhoon) and Storm Surge.

ggplot(worst_economic_conseq, aes(x = reorder(EVTYPE, -net_cost), y = net_cost)) +
    geom_bar(stat = "identity", aes(fill = EVTYPE, color = EVTYPE)) +
    theme_classic() +
    theme(axis.text.x = element_text(angle = 90), legend.position = "none") +
    scale_fill_manual(values = econ_colours) +
    scale_color_manual(values = econ_colours) +
    ylab("Total Economic Cost") +
    xlab("Event") +
    ggtitle("Economic Cost of Severe Weather Events")

[Figure: bar chart "Economic Cost of Severe Weather Events". x-axis: Event (FLOOD, Hurricane (Typhoon), STORM SURGE, TORNADO, HAIL, FLASH FLOOD, DROUGHT, HURRICANE, THUNDERSTORM WIND, TROPICAL STORM); y-axis: Total Economic Cost, with ticks at 0.0e+00, 5.0e+10, 1.0e+11 and 1.5e+11.]