Chapter 4: Mathematics as a Tool
DATA MANAGEMENT
Data management is the development, execution, and
supervision of plans, policies, programs, and
practices that control, protect, deliver, and
enhance the value of data and information
assets.
It is an administrative process by which
data is acquired, validated, stored,
protected, and processed.
Objectives
1. Use a variety of statistical tools to process and manage numerical
data.
Graphical form
Shows numerical values or relationships in pictorial
form. It makes use of graphs, symbols, or visual aids.
How should a tabular presentation be presented?
Simple - It should be straightforward and not loaded with irrelevant or trivial symbols and
ornamentation.
Clear - It should be easy to read and understand. Facts should be represented truthfully and
unambiguously.
Attractive - It should be stylish enough to attract and hold attention, with a neat, dignified,
and professional appearance.
There are various graphs we can use to
present data. These include:
Line graph - Ideal for visualizing how a variable changes over a period of time.
Bar graph - Consists of rectangular bars where the height of each bar represents the quantity or
frequency for each category.
[Figure: graph with a Year axis running from 1960 to 2010 in 5-year steps]
2. Determine the number of classes K
K = 1 + 3.322 log N
where N = total number of values to be grouped
3. Determine the class size C
C = R / K
where R is the range (highest value minus lowest value)
4. Determine the lower limit
of the first class
5. Construct the class intervals
and determine the class
frequencies
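The steps above can be sketched in Python. This is only an illustration under stated assumptions: K is rounded to the nearest whole number, the class size is rounded up, the lowest observation is taken as the lower limit of the first class, and the data are integers. The function name `frequency_distribution` is hypothetical; the heights are the ten values from this chapter's height example.

```python
import math

def frequency_distribution(values):
    """Group integer data into classes using K = 1 + 3.322 log N and C = R / K."""
    n = len(values)
    k = round(1 + 3.322 * math.log10(n))   # number of classes (rounding is an assumption)
    r = max(values) - min(values)          # range R
    c = math.ceil(r / k)                   # class size, rounded up (assumption)
    lower = min(values)                    # lower limit of the first class (assumption)
    classes = []
    while lower <= max(values):
        upper = lower + c - 1              # upper limit for integer-valued data
        freq = sum(lower <= v <= upper for v in values)
        classes.append((lower, upper, freq))
        lower += c
    return classes

heights = [170, 165, 155, 160, 150, 149, 152, 161, 163, 175]
table = frequency_distribution(heights)
for lower, upper, freq in table:
    print(f"{lower}-{upper}: {freq}")
```

Other textbooks round K and C differently, so the class limits produced here are one reasonable choice rather than the only one.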
FREQUENCY HISTOGRAM
A set of vertical bars whose areas are
proportional to the frequencies represented.
FREQUENCY POLYGON
A line chart plotted on the same scale as
the histogram. Each class frequency is plotted
against its class mark.
Data analysis and
interpretation
Measures of central tendency, measures
of dispersion, and measures of
skewness and kurtosis
Estimation of parameter(s) and
hypothesis testing
Measures of central
tendency
These are measures indicating the
center of a set of data
arranged in order of
magnitude.
Mean or arithmetic mean
(or average)
The most popular and well-known
measure of central
tendency.
$\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n}$
Weighted mean
Properties of the mean
The sum of the deviations of the
observations from the mean is
zero. The mean is denoted by $\bar{X}$.
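A short Python sketch may help make the mean and its zero-deviation property concrete. The quiz scores are the seven values used in this chapter's skewness and kurtosis examples. The weighted mean uses the standard form (sum of weight times value, divided by the sum of the weights), which the slides name but do not spell out; the grades and units are hypothetical.

```python
def mean(values):
    """Arithmetic mean: the sum of the observations divided by their count."""
    return sum(values) / len(values)

def weighted_mean(values, weights):
    """Weighted mean: sum of w * x divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

scores = [4, 7, 8, 2, 2, 9, 3]
m = mean(scores)

# Property: the deviations from the mean add up to zero.
deviation_sum = sum(x - m for x in scores)

# Hypothetical grades weighted by hypothetical course units.
grades = [85, 90, 78]
units = [3, 2, 5]
wm = weighted_mean(grades, units)

print(m, deviation_sum, wm)
```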
What is the median?
The median is defined as the middle value when a
set of observed values has been arranged in either
ascending (from lowest to highest value) or
descending (from highest to lowest value) order of
magnitude.
• The median divides the array into two equal
parts; that is, 50% of the total number of
observations are less than the median value while the
other 50% are greater than the median value.
• The median is denoted by Md.
Properties of the median
• The median does not depend on every data
value in a dataset, so it is not affected by extreme values.
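The definition of the median can be illustrated with a small Python sketch. Averaging the two middle values when the number of observations is even is a common convention assumed here; the slides do not state it. The last lines replace the largest score with an extreme value to show that the median does not move.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even-n convention (assumption)

scores = [4, 7, 8, 2, 2, 9, 3]
heights = [170, 165, 155, 160, 150, 149, 152, 161, 163, 175]

md_scores = median(scores)
md_heights = median(heights)

# Replacing the largest score (9) with an extreme value leaves the median unchanged.
md_with_outlier = median([4, 7, 8, 2, 2, 900, 3])

print(md_scores, md_heights, md_with_outlier)
```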
What is the mode?
• The most frequent score in the data set. It is
sometimes considered the most popular option.
Example: 1, 2, 2, 2, 8, 1, 4, 10
The most frequently occurring observation is 2,
which appears three times. Thus, the mode is 2, and
since there is only one mode, the distribution is
unimodal.
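As a rough check of the example, the mode can be found by counting how often each value occurs. The helper name `modes` is hypothetical; it returns every value tied for the highest count, so a list with one element corresponds to a unimodal distribution.

```python
from collections import Counter

def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

data = [1, 2, 2, 2, 8, 1, 4, 10]
result = modes(data)
print(result, "unimodal" if len(result) == 1 else "more than one mode")
```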
Example:
Suppose BS Applied Statistics has 10 students whose
heights (in cm) are as follows: 170, 165, 155, 160,
150, 149, 152, 161, 163, 175.
The computational formula of the sample variance is
$s^2 = \dfrac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n(n-1)}$
The computational formula of the population variance is
$\sigma^2 = \dfrac{N\sum X_i^2 - \left(\sum X_i\right)^2}{N^2}$
2. The larger the value of the variance, the more dispersed the
observations are;
3. The variance can be easily manipulated;
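The sketch below applies the usual computational forms of the variance to the ten heights from the example. It assumes the sample variance is [n ΣX² − (ΣX)²] / [n(n − 1)] and the population variance is [N ΣX² − (ΣX)²] / N²; the function names are hypothetical.

```python
def sample_variance(values):
    """Computational formula with divisor n(n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

def population_variance(values):
    """Computational formula with divisor N squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * n)

heights = [170, 165, 155, 160, 150, 149, 152, 161, 163, 175]
s2 = sample_variance(heights)
sigma2 = population_variance(heights)
print(round(s2, 2), sigma2)
```

The sample variance comes out larger than the population variance because it divides by n − 1 instead of n.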
Properties of Standard Deviation
The standard deviation has the same
properties as the variance.
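For illustration, Python's `statistics` module can compute the standard deviation of the heights directly. The coefficient of variation is taken here as the sample standard deviation divided by the mean, times 100; that is the standard definition and an assumption on our part, since the slides name the measure without giving a formula.

```python
import statistics

heights = [170, 165, 155, 160, 150, 149, 152, 161, 163, 175]

s = statistics.stdev(heights)        # sample standard deviation (divisor n - 1)
sigma = statistics.pstdev(heights)   # population standard deviation (divisor n)

# Coefficient of variation in percent (standard definition, assumed here).
cv = s / statistics.mean(heights) * 100

print(round(s, 3), round(sigma, 3), round(cv, 2))
```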
4. Coefficient of Variation
5. Skewness
Is a measure of how asymmetric the
distribution of data is about
the mean.
Using Measures of Central Tendency
• If Mean = Median = Mode, the skewness is zero. (Symmetrical)
• If Mean > Median > Mode, the skewness is positive.
• If Mean < Median < Mode, the skewness is negative.
The formula for the Pearsonian coefficient of
skewness, denoted by SK, is
$SK = \dfrac{3(\bar{X} - Md)}{s}$
where
SK - Pearsonian coefficient of skewness
$\bar{X}$ - the mean
Md - the median
s - the standard deviation
Using the measures of central tendency, tell whether the given data
are symmetric, skewed to the left, or skewed to the right.
Mean = 6 Median = 7 Mode = 8
Since Mean < Median < Mode, the distribution is
negatively skewed (skewed to the left).
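The comparison rule can be written as a small Python function. The name `skew_direction` is hypothetical, and the final branch for orderings the rule does not cover is our own addition.

```python
def skew_direction(mean, median, mode):
    """Classify skewness from the ordering of mean, median, and mode."""
    if mean == median == mode:
        return "symmetrical"
    if mean > median > mode:
        return "positively skewed (skewed to the right)"
    if mean < median < mode:
        return "negatively skewed (skewed to the left)"
    return "not determined by this rule"   # ordering not covered by the three cases

answer = skew_direction(mean=6, median=7, mode=8)
print(answer)
```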
Example:
The following data represent the scores of 7 BS Applied Statistics students in a quiz:
X1 = 4, X2 = 7, X3 = 8, X4 = 2, X5 = 2, X6 = 9, X7 = 3
Solution: Md = 4, $\bar{X}$ = 5, s = 2.73
$SK = \dfrac{3(5 - 4)}{2.73} = 1.0989 \approx 1.10$
Hence, the distribution is positively skewed.
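The worked example can be checked with a few lines of Python. One assumption to note: the value s = 2.73 corresponds to the standard deviation computed with divisor n, so the sketch uses `statistics.pstdev`. The result differs slightly in the third decimal place from the hand computation because the sketch does not round s first.

```python
import statistics

def pearson_skewness(values):
    """Pearsonian coefficient of skewness: 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    md = statistics.median(values)
    sd = statistics.pstdev(values)   # divisor n, which matches s = 2.73 in the example
    return 3 * (mean - md) / sd

scores = [4, 7, 8, 2, 2, 9, 3]
sk = pearson_skewness(scores)
print(round(sk, 2), "positively skewed" if sk > 0 else "not positively skewed")
```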
6. Coefficient of Kurtosis
Kurtosis measures the flatness and
peakedness of the distribution of a
given data set. It also measures the
degree of departure from the
normal distribution
[Figure: distribution curves labeled Leptokurtic (positive kurtosis), Mesokurtic (normal distribution), and Platykurtic (negative kurtosis)]
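The coefficient of population kurtosis is denoted by K and is given by
$K = \dfrac{\sum (X_i - \mu)^4 / N}{(\sigma^2)^2}$
The coefficient of sample kurtosis is denoted by K and is given by
$K = \dfrac{\sum (X_i - \bar{X})^4 / n}{(s^2)^2}$
and the coefficient of kurtosis for grouped data, denoted
by $K_G$, is given by
$K_G = \dfrac{\sum f_i (X_i - \bar{X}_G)^4 / n}{(s^2)^2}$
where $f_i$ is the frequency and $X_i$ the class mark of class i.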
If K < 3, then the distribution is platykurtic.
If K > 3, then the distribution is leptokurtic.
If K = 3, then the distribution is mesokurtic.
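Example:
The following data represent the scores of 7 BS Applied Statistics students in a quiz:
X1 = 4, X2 = 7, X3 = 8, X4 = 2, X5 = 2, X6 = 9, X7 = 3
Solution:
$\sum (X_i - \bar{X})^4$ = (4-5)⁴ + (7-5)⁴ + (8-5)⁴ + (2-5)⁴ + (2-5)⁴ + (9-5)⁴ + (3-5)⁴ = 532
$K = \dfrac{532 / 7}{(2.73^2)^2} = \dfrac{76}{55.55} \approx 1.37$
Since K < 3, the distribution is platykurtic.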
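A minimal Python sketch of the same computation follows, assuming the coefficient is the average fourth-power deviation divided by the squared variance, with divisor n throughout (as in the example, where s = 2.73). The function names and the tolerance used to test K = 3 are hypothetical choices.

```python
def kurtosis(values):
    """Average fourth-power deviation divided by the squared variance (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    fourth_moment = sum((x - mean) ** 4 for x in values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return fourth_moment / variance ** 2

def classify_kurtosis(k, tol=1e-9):
    """Apply the K < 3, K > 3, K = 3 rule; tol guards the floating-point equality."""
    if abs(k - 3) <= tol:
        return "mesokurtic"
    return "platykurtic" if k < 3 else "leptokurtic"

scores = [4, 7, 8, 2, 2, 9, 3]
k = kurtosis(scores)
label = classify_kurtosis(k)
print(round(k, 2), label)
```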
The normal probability density function is
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
for −∞ < x < ∞ and for constants μ and σ, where −∞ < μ < ∞, σ > 0, e ≈ 2.71828,
and π ≈ 3.14159.
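To see the density in action, here is a small Python sketch that evaluates the normal curve, assuming the usual form f(x) = exp(−½((x − μ)/σ)²) / (σ√(2π)). The function name `normal_pdf` is hypothetical. With μ = 0 and σ = 1 it gives the standard normal curve.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

peak = normal_pdf(0, mu=0, sigma=1)    # highest point of the standard normal curve
left = normal_pdf(-1, mu=0, sigma=1)
right = normal_pdf(1, mu=0, sigma=1)   # equals `left`: the curve is symmetric about mu
print(round(peak, 4), round(left, 4), round(right, 4))
```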
Standard Normal Distribution
- A correlation is a relationship or
association between two variables.
Correlation coefficient
- The linear correlation coefficient is a
number calculated from given data
that measures the strength of the
linear relationship between two
variables: x and y.
PEARSON PRODUCT MOMENT
COEFFICIENT
- Is a measure of the linear relationship
between two variables.
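The slides do not show the formula, so the sketch below assumes the standard computational form r = [nΣxy − ΣxΣy] / √([nΣx² − (Σx)²][nΣy² − (Σy)²]). The function name and the paired data (hours studied and quiz scores) are hypothetical and only illustrate the calculation.

```python
import math

def pearson_r(xs, ys):
    """Pearson product moment correlation coefficient (computational form)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

hours = [1, 2, 3, 4, 5]          # hypothetical x values
quiz_scores = [2, 4, 5, 4, 5]    # hypothetical y values
r = pearson_r(hours, quiz_scores)
print(round(r, 4))
```

Values of r near +1 or −1 indicate a strong linear relationship, and values near 0 indicate a weak one.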
Contingency Coefficient
The contingency coefficient
is the most widely used and oldest
measure of association
based on the chi-square
statistic.
Its formula is given by
$C = \sqrt{\dfrac{\chi^2}{\chi^2 + n}}$
where χ² is the chi-square statistic and n is the total frequency.
Cramer's V Coefficient
The Cramer's coefficient is a
measure of the degree of
association or relationship
between two sets of nominal
data.
Its formula is based on the chi-square
statistic and is given by
$V = \sqrt{\dfrac{\chi^2}{n(L-1)}}$
where χ² is the chi-square statistic, n is the total frequency,
and L is the minimum of the number of rows and columns.
Example:
A study was conducted to determine whether there is a significant association
between the highest educational attainment of the father/mother and the
number of siblings. A sample of 525 households was randomly
selected, with the following results:
Example:
Testing for the Significance of Cramer's V Coefficient
1. Ho: There is no significant association between educational
attainment and number of children.
Ha: There is a significant association between educational
attainment and number of children.
2. Level of significance: α = 0.05, with sample size n = 525
3. Test statistic: Chi-square test
4. Critical region: Reject Ho if χ²c > 9.488 (the critical value for 4 degrees of freedom)
5. Computations: Compute the chi-square test statistic using the
formula χ²c = {(50-64)²/64} + {(25-39)²/39} + {(60-31)²/31}
+ {(70-95)²/95} + {(88-58)²/58} + {(42-46)²/46}
+ {(138-90)²/90} + {(40-55)²/55} + {(20-44)²/44}
χ²c = 91.54
6. Decision: Since χ²c = 91.54 > 9.488, reject Ho and conclude that
there is a significant association between educational attainment
and number of children in the household.
The degree of association computed
using the formula of the contingency
coefficient is
$C = \sqrt{\dfrac{\chi^2}{\chi^2 + n}} = \sqrt{\dfrac{91.54}{91.54 + 525}} = 0.3853 \approx 0.385$
The degree of association computed
using the formula of Cramer's V coefficient
is
$V = \sqrt{\dfrac{\chi^2}{n(L-1)}} = \sqrt{\dfrac{91.54}{525(3-1)}} = 0.295264 \approx 0.295$
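Both measures of association can be reproduced from the reported chi-square value with a short Python sketch. It assumes the square-root forms √(χ²/(χ² + n)) and √(χ²/(n(L − 1))), and a 3 × 3 table, which is inferred from the nine terms in the chi-square sum and the critical value 9.488. The function names are hypothetical.

```python
import math

def contingency_coefficient(chi2, n):
    """Contingency coefficient: sqrt(chi2 / (chi2 + n))."""
    return math.sqrt(chi2 / (chi2 + n))

def cramers_v(chi2, n, rows, cols):
    """Cramer's V: sqrt(chi2 / (n * (L - 1))), where L = min(rows, cols)."""
    smaller = min(rows, cols)
    return math.sqrt(chi2 / (n * (smaller - 1)))

chi2 = 91.54   # chi-square statistic reported in the example
n = 525        # total frequency reported in the example

c = contingency_coefficient(chi2, n)
v = cramers_v(chi2, n, rows=3, cols=3)   # 3 x 3 table is an inference, see above
print(round(c, 3), round(v, 3))
```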
THANK
YOU!