Correlation1
Mathematical Exploration
TABLE OF CONTENTS:
1. Introduction
2. Background Information
4. Calculations
5. Concluding Findings
6. Bibliography
INTRODUCTION
My family and I have lived in countries such as India and South Africa, where clean water is
scarce. Several years spent in both countries have led me to believe that unclean water has a
large effect on the health and well-being of people, especially those who do not have a
permanent home or the opportunity to receive clean water. I was privileged to live in one of
the capitals of South Africa, where poverty was less common and resources were easier to
come by, but as I went to school and read the daily newspaper I would see the struggle of
those in rural areas to find clean water: pregnant women who could not get clean water to
wash themselves, and children who could not drink clean, filtered water. It was sad to see,
but it was the reality caused by the lack of a necessary resource: clean water. Children would
catch waterborne diseases, and death at an early age was sadly common. India had a similar
problem, but it was more prominent, even in the capital of the country. Through my
exploration, I want to find out the extent of the struggle for clean water in 50 randomly
selected countries and find the correlation between it and the mortality rate of the population
of each country. My hypothesis is that there is a correlation between the percentage of clean
BACKGROUND INFORMATION
The mortality rate, often known as the death rate, is a measure of the frequency of deaths in
a given population over a given period of time.1 The denominator used to calculate
mortality is, in theory, the average size of the population over that period. The impact of a
certain disease on a specific population may be studied using mortality rates.2 Factors that
contribute to the mortality rate include harmful behaviors such as smoking and lack of
exercise, as well as environmental factors such as air pollution, contaminated water, and
pathogens. Diseases such as stroke, brain trauma, and Alzheimer's are among the leading
causes of death.
1 "Health Indicators Related to Disease, Death, and Reproduction - PMC." 23 Jan. 2019,
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6378386/. Accessed 30 Nov. 2023.
Mortality rate = Number of deaths / Total Population
The need for clean water is significant. Water is a basic human resource needed for survival,
both for hydration and sanitation. Contaminated water often spreads waterborne diseases
such as giardiasis, dysentery, typhoid fever, E. coli infection, and salmonellosis.
Using a correlational study, I will investigate the relationship between the percentage of
people with access to clean water and the mortality rate. I will use a systematic sample,
obtaining data by choosing every 6th country from an alphabetical list of countries in The
World Data bank, so as to have a range of data and a statistically sound sample, which will
guide me in measuring the linear correlation between my two chosen sets of data. I will
create box graphs and scatter plots to show the limitations and strengths of my two
variables. Through this, I will be able to answer my research question “Is there a correlation
2 "Causes of Death - Our World in Data." https://ourworldindata.org/causes-of-death. Accessed 30
Nov. 2023.
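The systematic sampling step, choosing every 6th country from an alphabetical list, can be sketched in Python. This is only a minimal illustration: the function name systematic_sample and the short list of countries are my own assumptions and are not the actual list used as the source.

```python
def systematic_sample(items, step):
    """Return every `step`-th item of `items`, starting with the `step`-th one."""
    return items[step - 1::step]


# Hypothetical, shortened alphabetical list used only for illustration;
# the real source list is much longer and contains other entries.
countries = [
    "Afghanistan", "Albania", "Algeria", "Andorra", "Angola", "Argentina",
    "Armenia", "Australia", "Austria", "Azerbaijan", "Bahamas", "Bahrain",
]

sample = systematic_sample(countries, 6)
print(sample)  # ['Argentina', 'Bahrain']
```

If a selected country has no data, a replacement rule has to be chosen separately; the sketch does not attempt to model that.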
Table 1: Percentage of population with clean water and rate of mortality in 60 countries
Columns: Country, Population with clean water (%), Mortality rate
Argentina 51 75
Bolivia 53 64
Bahrain 91 79
Brazil 49 73
Chad 10 53
Croatia 68 76
Channel Islands 82 81
Cuba 37 74
Ecuador 42 72
Estonia 93 77
France 79 82
Guinea-Bissau 12 60
Greenland 92 71
Iceland 84 83
India 46 67
Jordan 82 74
Kuwait 100 79
Lebanon 16 75
Libya 22 72
Malta 92 83
Myanmar 61 66
Montenegro 45 74
Nigeria 31 53
North Macedonia 12 75
Philippines 61 69
Puerto Rico 33 80
Spain 96 83
Sweden 95 83
Slovenia 72 81
Somalia 32 55
Tanzania 26 66
Tuvalu 6 65
United Arab Emirates 99 79
Ukraine 72 70
Venezuela 23 71
Yemen 19 64
Zimbabwe 26 59
Algeria 18 76
Albania 48 76
Bulgaria 72 72
Bhutan 65 72
Chile 79 79
Czechia 85 77
Cyprus 77 81
Djibouti 37 62
Ethiopia 7 65
Ghana 13 64
Columns: Statistic, Population with clean water (%), Mortality rate
Minimum 6 53
Q1 27.25 66.25
Median 52 73.5
Q3 79 79
Maximum 100 83
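A five-number summary like the one above can be sketched in Python. The function name five_number_summary is my own, and the quartiles are found with the median-of-halves rule, which is an assumption: calculators and software use slightly different quartile conventions. The example uses only the first eight clean-water values of Table 1, so it is not expected to reproduce the summary of the full data set.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumption: for an odd number of values, the middle value is left
    out of both halves before the quartiles are taken.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


# First eight clean-water percentages from Table 1 (Argentina to Cuba).
clean_water_subset = [51, 53, 91, 49, 10, 68, 82, 37]
print(five_number_summary(clean_water_subset))  # (10, 43.0, 52.0, 75.0, 91)
```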
to the right indicating that the majority of my data is located on the left side of the box.
Range: Maximum - Minimum
100-6=94
IQR: Q3-Q1
79-27.25=51.75
The outliers:
The upper boundary: Q3 + 1.5 * IQR = 79 + 1.5 * 51.75 = 156.625
As the value of my median is larger than that of my mean, the shape of my graph is skewed
to the left, indicating that most of my data is located on the right side of the box.
Range: Maximum - Minimum
83-53=30
IQR : Q3-Q1
79-66.25=12.75
The outliers:
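The outlier boundaries for both variables can be checked with a short Python sketch of the 1.5 * IQR rule, using the quartiles, minimum and maximum from the summary table. The function name outlier_fences and the dictionary layout are my own assumptions, made only for this illustration.

```python
def outlier_fences(q1, q3):
    """Return (lower, upper) boundaries using the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Summary values taken from the five-number summary of each variable.
summaries = {
    "clean water (%)": {"min": 6, "q1": 27.25, "q3": 79, "max": 100},
    "mortality rate": {"min": 53, "q1": 66.25, "q3": 79, "max": 83},
}

for name, s in summaries.items():
    lower, upper = outlier_fences(s["q1"], s["q3"])
    # An outlier exists only if the extreme values fall outside the fences.
    has_outliers = s["min"] < lower or s["max"] > upper
    print(f"{name}: lower={lower}, upper={upper}, outliers={has_outliers}")
```

Since the minimum and maximum of each variable lie inside its boundaries, neither variable has a value that this rule would flag as an outlier.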
I have calculated Pearson's correlation coefficient using the TI-Nspire calculator and obtained
r = 0.671
4
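For readers without a TI-Nspire, the same coefficient can be computed from its definition. The sketch below is a minimal Python version; the function name pearson_r is my own, and the example uses only the first five rows of Table 1, so its result will differ from the r = 0.671 obtained for the full data set.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# First five rows of Table 1 (Argentina, Bolivia, Bahrain, Brazil, Chad).
clean_water = [51, 53, 91, 49, 10]
mortality = [75, 64, 79, 73, 53]

r = pearson_r(clean_water, mortality)
print(round(r, 3))
```

The function assumes that neither list is constant, since a constant list would make the denominator zero.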
A positive relationship is indicated by a correlation coefficient greater than zero,
while a negative relationship is indicated by a value less than zero. More specifically, a
correlation coefficient between -0.5 and 0.5 generally indicates a very weak to moderate
correlation, while values beyond that range suggest a stronger correlation. As my correlation
coefficient may be rounded to 0.7, it suggests that a strong correlation is present between my
5
This method has the following advantages: it is easier to interpret, and it generates data with
improved statistical properties. However, the main disadvantage of this method is that when
CONCLUDING FINDINGS
Through my research into the mortality rates of many countries, I found a major
environmental concern: contaminated water. This issue is much more prevalent in countries
with scarce clean water resources but is an important factor in every country's mortality
rate. I initially researched India and South Africa specifically, as I have a personal
connection to them. I then wanted to investigate more countries, both those with scarce
water resources and those with more developed water resources. I faced a limitation at the
beginning of my
be fully executed as there was no data available for the percentage of clean accessible water
in some countries. Therefore, I obtained data from the following country in the list, that
being the 6th country. I went through the list and resources twice to gather a sufficient
amount of data for each country and to reach my desired number of countries to
investigate. An additional limitation was that, because insufficient statistics were available
for certain countries, the sample did not contain an equal ratio of countries with and without
clean water resources, which is another factor that needed to be considered.
suggest a positive correlation between the percentage of people with clean drinking water and
the mortality rate of that country. From the analysis of both my box graphs, my results appear
to be distributed throughout the graph, which has led to a greater dispersion of data. Through
my investigation and the creation of the box graphs for each variable and their scatter graphs,
I have observed the absence of any outliers. According to the value that was produced by the
BIBLIOGRAPHY
Investopedia, www.investopedia.com/ask/answers/032515/what-does-it-mean-if-
“The Disadvantages of the Pearson R Correlation Method Are It Assumes That There -
www.coursehero.com/file/p4jtjlo/The-disadvantages-of-the-Pearson-r-correlation-method-are-It-assumes-that-there/.