Statistics

Uploaded by

Ali Allam

Lebanese University

Faculty of Pedagogy 1, Unesco

Dr. Ali Allam

1
Statistics and Sports

Statistics and Evaluation

2
Outline
o Chapter 1: Introduction to Statistics.

o Chapter 2: Frequency distribution and graphical representations.

o Chapter 3: Measuring centers.

o Chapter 4: Measures of spread.

o Chapter 5: Bivariate distribution.

o Chapter 6: Events and probability.

o Chapter 7: Standard normal distribution.

3
Chapter 1

Introduction to Statistics

4
Outline
1. Definition.

2. Inferential and descriptive statistics.

3. Population and sample.

4. Individual and variable.

5. Qualitative and quantitative variable.

6. Discrete and continuous variable.

5
1. Definition
o Statistics is the science of data.

o It involves collecting, classifying, summarizing, organizing, analyzing, and interpreting numerical information.

o It is used in several different disciplines.

o It is used to make decisions and draw conclusions based on data.

o There are two types of statistics: inferential and descriptive.

6
2.1. Inferential Statistics

o Inferential statistics utilizes sample data to make

estimates, decisions, predictions, or other
generalizations about a larger set of data.

o One of the most commonly used inferential

techniques is hypothesis testing.

o The main goal of inferential statistics is drawing

conclusions.

7
2.2. Descriptive Statistics

o Descriptive statistics utilizes numerical and

graphical methods to look for patterns, to
summarize and to present the information revealed
in a data set.

o The class of descriptive statistics includes both

numerical measures (mean, median, …) and
graphical displays of data (bar graph, pie chart, …).

o The main goal of descriptive statistics is to describe

a data set.
8
2.2. Descriptive Statistics

[Figure: a pie chart (left) and a bar graph (right).]

9
3. Population and Sample
o A population is the entire collection of events in
which we are interested. It can be of any size.

o A sample is a subset of the units of a population, and

is typically smaller than the population.

o In a statistical study, all elements of a sample are

available for observation, which is not typically the
case for a population.
10
3. Population and Sample

11
4. Individual and Variable
o Any set of data contains information about some
group of individuals. The information is organized in
variables (or data).

o Individuals are the objects described by a set of data.

Individuals may be people, but they may also be
animals or things.

o A variable is any characteristic of an individual. A

variable can take different values for different
individuals.
12
5. Qualitative or Quantitative variable
o Qualitative (or categorical) data are measurements for which there is no natural numerical scale.
o Examples of qualitative data: color (green, blue, yellow, …), gender (male, female, …), size (small, medium, large, …).

o Quantitative data are numerical measurements that arise from a natural numerical scale. They contain numbers.
o Examples of quantitative data: age (10, 15, …), height (150 cm, 160 cm, …), weight (70 kg, 80 kg, …).
13
6. Discrete and Continuous variable
o A quantitative variable can be discrete or continuous.

o A discrete variable can only take certain isolated values.

o Examples of values of a discrete variable: 12; 15; 16; …

o A continuous variable can take any value within a given range (an interval).
o Examples of ranges for a continuous variable: [0-5[; [5-10[; …

14
Applications
1. Identify each of the following data sets as either a
population or a sample:

* The grade point averages (GPAs) of all students at a college: Population.

* The GPAs of a randomly selected group of students on a college campus: Sample.

15
Applications
2. Identify the following measures as either
quantitative or qualitative:

* The genders of the first 40 newborns in a hospital in one year: Qualitative.
* The natural hair color of 20 randomly selected
fashion models: Qualitative.
* The ages of 20 randomly selected fashion models:
Quantitative.
* The political affiliation of 500 randomly selected
voters: Qualitative.

16
Applications
3. A researcher wishes to estimate the average weight
of newborns in South America in the last five years. He
takes a random sample of 235 newborns and obtains
an average of 3.27 kg.

* What is the population of interest? All newborn babies in South America in the last five years.
* What is the parameter of interest? The average birth weight of all newborn babies in South America in the last five years.
* Based on this sample, do we know the average weight of newborns in South America? Explain. No, not exactly: the sample average of 3.27 kg is only an estimate (an approximate value) of the population average.
17
Applications
4. A sociologist wishes to estimate the proportion of all adults in a certain region who have never married. In a random sample of 1,320 adults, 145 have never married, hence 145/1320 ≈ 0.11, or about 11%, have never married.

* What is the population? All adults in the region.

* What is the parameter of interest? The proportion of the adults in the region who have never married.
* Based on this sample, do we know the proportion of all adults who have never married? Explain. No, not exactly: the sample proportion of about 11% is only an estimate (an approximate value) of the population proportion.
18
Chapter 2

Frequency Distribution and

Graphical Representations

20
Outline
1. Frequency and relative frequency.

2. Cumulative frequency.

3. Percentage.

4. Graphical representations for a qualitative variable.

5. Graphical representations for a discrete variable.

6. Graphical representations for a continuous variable.

21
1. Frequency and relative frequency
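o The frequency ‘ni’ of a particular observation is the number of times that observation appears in the data.

o The total frequency ‘N’ is the sum of the frequencies (k is the number of distinct values or classes):

$$N = \sum_{i=1}^{k} n_i$$

o The relative frequency ‘fi’ is:

$$f_i = \frac{n_i}{N} = \frac{n_i}{\sum_{j=1}^{k} n_j} \quad \text{and} \quad F = \sum_{i=1}^{k} f_i = 1$$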

22
2. Cumulative Frequency

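o The increasing cumulative frequency is obtained by adding each frequency to the sum of the preceding frequencies; the decreasing cumulative frequency is obtained by adding each frequency to the sum of the following frequencies.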

o Ni ↑: Increasing cumulative frequency.

o Ni ↓: Decreasing cumulative frequency.

o Fi ↑: Increasing cumulative relative frequency.

o Fi ↓: Decreasing cumulative relative frequency.

23
2.1. Applications
o Let Xi be the age of the 11 players of Al Ahed football club. Calculate fi, Ni ↑, Ni ↓, Fi ↑ and Fi ↓.
Xi     20     22     25     27     30     Total
ni     2      2      3      0      4      11
fi     2/11   2/11   3/11   0/11   4/11   1
Ni ↑   2      4      7      7      11
Ni ↓   11     9      7      4      4
Fi ↑   2/11   4/11   7/11   7/11   1
Fi ↓   1      9/11   7/11   4/11   4/11
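The rows of the table above can be reproduced with a few lines of code. The sketch below is only illustrative: the variable names are assumptions, and exact fractions are used so that the output matches the hand-written table.

```python
from fractions import Fraction
from itertools import accumulate

# Ages (Xi) and frequencies (ni) from the Al Ahed example above.
values = [20, 22, 25, 27, 30]
counts = [2, 2, 3, 0, 4]

N = sum(counts)                                     # total frequency
rel_freq = [Fraction(n, N) for n in counts]         # fi = ni / N
cum_up = list(accumulate(counts))                   # Ni increasing: running total from the first value
cum_down = list(accumulate(counts[::-1]))[::-1]     # Ni decreasing: running total from the last value
rel_cum_up = [Fraction(c, N) for c in cum_up]       # Fi increasing
rel_cum_down = [Fraction(c, N) for c in cum_down]   # Fi decreasing

# One printed line per value Xi: Xi, ni, fi, Ni up, Ni down.
for row in zip(values, counts, rel_freq, cum_up, cum_down):
    print(*row)
```

The same lists work for grouped data such as the basketball heights: only `values` changes (it would hold the class labels instead of single values).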

24
2.2. Applications
o Let Xi be the height in cm of the 12 players of Al Riyadi basketball club. Calculate fi, Ni ↑, Ni ↓, Fi ↑ and Fi ↓.
Xi     185-190   190-195   195-200   200-205   205-210   Total
ni     3         2         4         1         2         12
fi     3/12      2/12      4/12      1/12      2/12      1
Ni ↑   3         5         9         10        12
Ni ↓   12        9         7         3         2
Fi ↑   3/12      5/12      9/12      10/12     1
Fi ↓   1         9/12      7/12      3/12      2/12

25
3. Percentage

o The percentage pi (%) is the relative frequency times 100.

$$p_i\,(\%) = \frac{n_i}{N} \times 100 = f_i \times 100$$

o The percentage is the form most commonly used for drawing conclusions.

26
4. Graphical representations for a
qualitative variable.
o In general, a qualitative variable can be graphically
represented by:

1. Pie-Chart and Semi Pie-Chart.

2. Bar graph (or bar chart).

27
4.1. Pie-Chart and Semi-Pie Chart.
o To draw a pie chart, we must calculate the angle ‘αi’ of each event Xi.
o Every sector represents an event.

o For a Pie-Chart:

$$\alpha_i = \frac{n_i}{N} \times 360^\circ = f_i \times 360^\circ$$

o For a Semi Pie-Chart:

$$\alpha_i = \frac{n_i}{N} \times 180^\circ = f_i \times 180^\circ$$
28
4.1. Applications.
o The number of World Cup wins for 4 countries is given in the table below. Draw the pie chart.
Xi   Italy   Argentina   Brazil   Germany   Total
ni   4       2           5        4         15
αi   96°     48°         120°     96°       360°
(For example, for Italy: αi = 4/15 × 360° = 96°.)
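The angle formula can be checked numerically. The helper below is a minimal sketch (the function name `sector_angles` is an assumption, not part of the course); passing 180 instead of 360 gives the semi pie-chart angles.

```python
def sector_angles(counts, full_angle=360):
    """Return the sector angle in degrees for each frequency: ni * full_angle / N."""
    total = sum(counts)
    return [n * full_angle / total for n in counts]

# World Cup wins from the table above.
wins = {"Italy": 4, "Argentina": 2, "Brazil": 5, "Germany": 4}
angles = sector_angles(list(wins.values()))

for country, angle in zip(wins, angles):
    print(f"{country}: {angle:.0f} degrees")

# The angles of a full pie chart always add up to 360 degrees.
print(sum(angles))
```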

29
4.2. Bar graph (or Bar chart).
o The height of every bar corresponds to the frequency.

o The width of the bar has no significance.

30
4.2. Applications.
o The number of World Cup wins for 4 countries is given in the table below. Draw the bar graph.
Xi Italy Argentina Brazil Germany Total

ni 4 2 5 4 15

31
5. Graphical representations for a
discrete variable.
o In general, a discrete variable can be graphically
represented by:

1. Pie-Chart and Semi Pie-Chart.

2. Bar graph (or bar chart).

3. Polygon of frequency ni (or relative frequency fi).

4. Polygon of cumulative frequency (Ni↑, Ni↓, Fi ↑, Fi↓).

32
5.1. Pie-Chart and Semi-Pie Chart.
o Same as for qualitative data.

o Example: Draw the semi-Pie chart for the following

data. Xi represents the grades over 20 in statistics for
10 students in a class.
Xi 5 10 15 18 Total
ni 3 2 4 1 10

αi 54° = (3/10x180) 36° 72° 18° 180°

33
5.2. Bar graph (or Bar chart).
o The height of every bar corresponds to the frequency.

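o The width of the bar has no significance.

o The form of a bar graph for a discrete variable is given below.

[Figure: two bar graphs over the values x1, x2, x3, …, xm. Left: bar heights are the frequencies n1, n2, n3, … (their sum is N). Right: bar heights are the relative frequencies f1, f2, f3, … (their sum is 1).]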
34
5.2. Applications.
o Example: Draw the bar graph for the following data. Xi
represents the grades over 20 in statistics for 10
students in a class.
Xi 5 10 15 18 Total
ni 3 2 4 1 10

35
5.3. Polygon of frequency ni (or fi).
o Example: Draw the polygon of frequency ni for the
following data. Xi represents the grades over 20 in
statistics for 10 students in a class.
Xi 5 10 15 18 Total
ni 3 2 4 1 10

36
5.4. Polygon of cumulative frequency.
o Example: Draw the polygons of cumulative frequency for the following data. Xi represents the grades over 20 in statistics for 10 students in a class.

Xi     5      10     15     18     Total
ni     3      2      4      1      10
Ni ↑   3      5      9      10
Ni ↓   10     7      5      1
Fi ↑   3/10   5/10   9/10   1
Fi ↓   1      7/10   5/10   1/10

o The form of the polygon of Ni↑ is the same as that of Fi ↑, and the form of the polygon of Ni↓ is the same as that of Fi↓.
37
5.4. Applications.
o The polygons of cumulative frequencies are given below: a) Polygon of Ni↑ and b) Polygon of Ni↓.

[Figure: a) polygon of Ni↑; b) polygon of Ni↓.]

38
6. Graphical representations for a
continuous variable.
o In general, a continuous variable can be graphically represented by:

1. Pie-Chart and Semi Pie-Chart.

2. Histogram.

3. Polygon of frequency ni (or relative frequency fi).

4. Polygon of cumulative frequency (Ni↑, Ni↓, Fi ↑, Fi↓).

39
6.1. Pie-Chart and Semi-Pie Chart.
o Same as for qualitative data.

o Example: Draw the pie chart. Xi represents the weight in kg of 20 newborns in a hospital in Beirut.
Xi   [0-1[   [1-2[   [2-3[   [3-4[   Total
ni   1       5       8       6       20
αi   18°     90°     144°    108°    360°
(For example, for [0-1[: αi = 1/20 × 360° = 18°.)

40
6.2. Histogram.
o The height of every bar corresponds to the frequency.

o The width of the bar corresponds to the width of the

interval.

o No space exists between the bars.

41
6.3. Polygon of frequency ni (or fi).
o Example: Draw the polygon of frequency of the
following data. Xi represents the weight in Kg for 20
newborns in a hospital in Beirut.
Xi [0-1 [ [1-2[ [2-3[ [3-4[ Total
ni 1 5 8 6 20

o We join the midpoints of each class interval.

42
6.4. Polygon of cumulative frequency.
o Example: Draw the polygons of cumulative frequency for the following data. Xi represents the weight in kg of 20 newborns in a hospital in Beirut.
Xi     [0-1[   [1-2[   [2-3[   [3-4[   Total
ni     1       5       8       6       20
Ni ↑   1       6       14      20
Ni ↓   20      19      14      6
Fi ↑   1/20    6/20    14/20   1
Fi ↓   1       19/20   14/20   6/20

o The form of the polygon of Ni↑ is the same as that of Fi ↑, and the form of the polygon of Ni↓ is the same as that of Fi↓.
43
6.4. Applications.
o The polygons of cumulative frequencies are given below: a) Polygon of Ni↑ and b) Polygon of Ni↓.

[Figure: a) polygon of Ni↑ (or Fi ↑), drawn by joining the points at the upper limit of each class; b) polygon of Ni↓ (or Fi ↓), drawn by joining the points at the lower limit of each class.]
44
Chapter 3

Measuring centers

46
Outline
1. The Mean X̄.

2. The Mode Mo.

3. The Median Me.

4. The Quantiles.

5. Box Plot.

6. The relation between the measuring centers.

7. Other types of mean.

47
1. The Mean X̄
o X̄ is calculated only for quantitative variables.

o X̄ is called the arithmetic mean and is the most commonly used measure of center in data analysis.

o X̄ is calculated for:

1) Simple series.

2) Discrete variable.

3) Continuous variable.
48
1.1. The Mean X̄
o For a simple series: X1, X2, X3, …

$$\bar{X} = \sum_{i=1}^{N} \frac{X_i}{N}$$

o Example: Calculate the mean for the following data:

2–3–3–5–3–4

$$\bar{X} = \sum_{i=1}^{N} \frac{X_i}{N} = \frac{2 + 3 + 3 + 5 + 3 + 4}{6} = \frac{20}{6} \approx 3.33$$

49
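1.2. The Mean X̄
o For a discrete variable where the data are collected in a table (k is the number of distinct values):

$$\bar{X} = \sum_{i=1}^{k} \frac{n_i X_i}{N} = \sum_{i=1}^{k} f_i X_i$$

o Example: Calculate the mean for the following data:

Xi 2 6 8 Total
ni 4 3 1 8

$$\bar{X} = \sum_{i=1}^{k} \frac{n_i X_i}{N} = \frac{(2 \times 4) + (6 \times 3) + (8 \times 1)}{8} = \frac{34}{8} = 4.25$$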
50
1.3. The Mean X̄
o For a continuous variable where the data are collected in a table (k is the number of classes):

$$\bar{X} = \sum_{i=1}^{k} \frac{n_i C_i}{N} = \sum_{i=1}^{k} f_i C_i \quad \text{and} \quad C_i = \frac{a + b}{2}$$

where Ci is the center of the interval [a; b[.

o Example: Calculate the mean for the following data:

Xi [0; 2[ [2; 4[ [4; 6[ Total
ni 2 5 3 10
Ci 1 3 5 -

$$\bar{X} = \sum_{i=1}^{k} \frac{n_i C_i}{N} = \frac{(2 \times 1) + (5 \times 3) + (3 \times 5)}{10} = \frac{32}{10} = 3.2$$
51
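The three cases of the mean (simple series, discrete table, continuous table) differ only in how the values are weighted. The sketch below recomputes the three worked examples; the function names are illustrative assumptions.

```python
def simple_mean(data):
    """Mean of a simple series: sum of the values divided by their number."""
    return sum(data) / len(data)

def weighted_mean(values, counts):
    """Mean of a frequency table: sum(ni * Xi) / N."""
    return sum(n * x for x, n in zip(values, counts)) / sum(counts)

def class_centers(classes):
    """Center Ci = (a + b) / 2 of each class [a; b[."""
    return [(a + b) / 2 for a, b in classes]

series_mean = simple_mean([2, 3, 3, 5, 3, 4])
discrete_mean = weighted_mean([2, 6, 8], [4, 3, 1])
grouped_mean = weighted_mean(class_centers([(0, 2), (2, 4), (4, 6)]), [2, 5, 3])

print(round(series_mean, 2), discrete_mean, grouped_mean)
```

For a continuous variable the result is an approximation, since every observation in a class is replaced by the class center.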
2. The Mode Mo
o Mo is calculated for qualitative and quantitative
variables.

o Mo represents the most repeated value, that is, the one with the highest frequency ni (or fi).

o Some distributions are bimodal, trimodal, …

o Mo is calculated for:
1) Qualitative variable.
2) Simple series.
3) Discrete variable.
4) Continuous variable.
52
2.1. The Mode Mo
o For a qualitative variable, Mo is the category with the highest ni (or fi).

o Graphically, Mo corresponds to the tallest bar.

o Example: Brazil is the Mo.

Xi Italy Argentina Brazil Germany Total
ni 4 2 5 4 15

53
2.2. The Mode Mo
o For a simple series: X1, X2, X3, …

o Mo is the most repeated value.

o Example: Identify the Mode Mo for the following

data:

2–3–3–5–3–4

o Solution: ‘3’ is the most repeated value

⇛ Mo = 3.
54
2.3. The Mode Mo
o For a discrete variable : Mo has the highest
frequency ni (or fi).

o Graphically, Mo corresponds to the tallest bar.

o Example: Identify the Mo for the following data:

Xi 2 6 8 Total
ni 4 3 1 8

Mo = 2

55
2.4. The Mode Mo
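o For a continuous variable, we identify the modal class, which has the highest frequency, then:

$$Mo = L_i + a_i \frac{\Delta_1}{\Delta_1 + \Delta_2} \quad \text{and} \quad \Delta_1 = n_i - n_{i-1}, \quad \Delta_2 = n_i - n_{i+1}$$

where Li is the lower limit of the modal class

ai is the width of the modal class

ni is the frequency of the modal class; n(i-1) and n(i+1) are the frequencies of the classes just before and just after it

o Example: If [a; b[ is the modal class:

Li = a and ai = b – a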
56
2.4. The Mode Mo
o Example: Calculate the mode for the following data:

Xi [0; 2[ [2; 4[ [4; 6[ Total

ni 2 5 3 10

o [2; 4[ is the modal class:

$$Mo = 2 + (4 - 2) \times \frac{5 - 2}{(5 - 2) + (5 - 3)} = 2 + 2 \times \frac{3}{3 + 2} = 3.2$$

o Mo = 3.2, which belongs to [2; 4[.
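The interpolation used for the mode of a continuous variable can be written as a short function. This is a sketch under stated assumptions: the name `grouped_mode` is illustrative, a missing neighbouring class is treated as having frequency 0, and if several classes share the highest frequency only the first one is used.

```python
def grouped_mode(classes, counts):
    """Mode of grouped data: Mo = Li + ai * d1 / (d1 + d2)."""
    i = counts.index(max(counts))                        # index of the (first) modal class
    lower, upper = classes[i]
    before = counts[i - 1] if i > 0 else 0               # assumption: no class before means frequency 0
    after = counts[i + 1] if i < len(counts) - 1 else 0  # assumption: no class after means frequency 0
    d1 = counts[i] - before
    d2 = counts[i] - after
    return lower + (upper - lower) * d1 / (d1 + d2)

# Classes [0; 2[, [2; 4[, [4; 6[ with frequencies 2, 5, 3 (the example above).
mode = grouped_mode([(0, 2), (2, 4), (4, 6)], [2, 5, 3])
print(round(mode, 2))
```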

57
2.4. The Mode Mo
o Graphically, for a continuous variable, we identify the modal class as the one with the tallest bar in the histogram.

o Then we estimate the Mo graphically.

58
3. The Median Me
o Me represents the central value of a statistical distribution: it cuts the ordered statistical series into 2 parts of equal size.

o There exists only one median Me for a given distribution.

o Me is calculated for:
1) Some qualitative variables (ordinal scale).
2) Simple series.
3) Discrete variable.
4) Continuous variable.
59
3.1. The Median Me
o For qualitative data, two scales exist:

1) Nominal scale: the values are used for labeling only, without any quantitative value or order (Ex: Green/blue/white/…, Male/Female, …). Here, the Me cannot be estimated and is not meaningful.

2) Ordinal scale: the order of the values is important and significant (Ex: unhappy/happy/very happy/…). Here, the Me can be estimated and is meaningful.
60
3.2. The Median Me
o For a simple series: X1, X2, X3, …

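o Me is the middle value of the ordered series: the value of rank (N+1)/2 if N is odd, or the average of the values of ranks N/2 and N/2 + 1 if N is even.

o Example: Identify the Median Me for the following data:
2–3–3–5–3–4
o Solution: we sort the series in increasing order:
2–3–3–3–4–5

N = 6 (even), then N/2 = 6/2 = 3

Thus Me = (X3 + X4)/2 = (3 + 3)/2 = 3
61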
3.3. The Median Me
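o For a discrete variable (Xr is the value of rank r in the ordered series):

$$N \text{ is odd}: \quad Me = X_{\frac{N+1}{2}}$$

$$N \text{ is even}: \quad Me = \frac{X_{\frac{N}{2}} + X_{\frac{N}{2}+1}}{2}$$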

o Example: N = 8 (even).
Xi 2 6 8 Total
ni 4 3 1 8

o Me = (X4 + X5)/2 = (2 + 6)/2 = 4.

62
3.3. The Median Me
o Example: N = 9 (odd).
Xi 1 5 7 Total
ni 2 6 1 9

o Me = X((9+1)/2) = X5 = 5.

o Graphically, Me is the abscissa of the point where the polygon of increasing or decreasing cumulative frequency meets the horizontal line y = N/2.

o Equivalently, Me is the abscissa of the point where the polygon of increasing or decreasing cumulative relative frequency meets the horizontal line y = 0.5.
63
3.4. The Median Me
o For a continuous variable, we identify the median class, which is the first class whose increasing cumulative frequency reaches N/2, then:

$$Me = L_i + a_i \frac{\frac{N}{2} - N_{i-1}}{n_i}$$

where Li is the lower limit of the median class
ai is the width of the median class
ni is the frequency of the median class
N is the total frequency
N(i-1) is the increasing cumulative frequency of the class before the median class.
64
3.4. The Median Me
o If ‘N’ is odd, we replace N/2 by (N+1)/2 in the
formula.
o Example: Calculate the Median Me.
Xi [0; 2[ [2; 4[ [4; 6[ Total
ni 2 5 3 10
Ni ↑ 2 7 10

o Solution: N = 10 (even).
o N/2 = 5.
o [2; 4[ is the median class.
$$Me = 2 + 2 \times \frac{5 - 2}{5} = 3.2$$

and 3.2 belongs to [2; 4[.
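The median formula for a continuous variable can be sketched in code as follows. The function name is an assumption, and the sketch always uses N/2 as the target rank: the (N+1)/2 adjustment mentioned above for odd N is deliberately not implemented.

```python
from itertools import accumulate

def grouped_median(classes, counts):
    """Median of grouped data: Me = Li + ai * (N/2 - N(i-1)) / ni."""
    half = sum(counts) / 2
    cumulative = list(accumulate(counts))          # increasing cumulative frequencies
    # The median class is the first class whose cumulative frequency reaches N/2.
    i = next(k for k, c in enumerate(cumulative) if c >= half)
    lower, upper = classes[i]
    previous = cumulative[i - 1] if i > 0 else 0   # N(i-1), or 0 for the first class
    return lower + (upper - lower) * (half - previous) / counts[i]

# Classes [0; 2[, [2; 4[, [4; 6[ with frequencies 2, 5, 3 (the example above).
median = grouped_median([(0, 2), (2, 4), (4, 6)], [2, 5, 3])
print(round(median, 2))
```

Replacing N/2 by N/4 or 3N/4 in the same interpolation gives the quartiles Q1 and Q3.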
65
3.4. The Median Me
o Graphically, for a continuous variable, Me is the abscissa of the vertical line which divides the histogram into two parts with equal areas.

66
3.4. The Median Me
o Graphically, for a continuous variable, Me is the abscissa of the point where the polygon of increasing or decreasing cumulative frequency meets the line y = N/2 (or where the polygon of increasing or decreasing cumulative relative frequency meets the line y = 0.5).

67
3.4. The Median Me
o Graphically, for a continuous variable, Me is the abscissa of the point of intersection of the polygons of increasing and decreasing cumulative frequencies (or of the polygons of increasing and decreasing cumulative relative frequencies).

68
4. The Quantiles
o The quantiles are cut points that divide an ordered statistical series into parts of equal frequency (values taken at regular intervals of the quantile function of a variable).

o The quantiles have special names. Some of these quantiles are:

1) Quartiles Q.

2) Deciles D.

3) Percentiles P.

69
4.1. The Quartiles Q
o The quartiles Q are values that divide the statistical
series into 4 equal parts.

o They are named Q1, Q2 and Q3.

70
4.1. The Quartiles Q
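o Q2 divides the series into 50% - 50%. Thus, Q2 = Me.

o For a continuous variable, by using the same formula as for the median:

$$Q_1 = L_i + a_i \frac{\frac{N}{4} - N_{i-1}}{n_i}$$

$$Q_2 = Me$$

$$Q_3 = L_i + a_i \frac{\frac{3N}{4} - N_{i-1}}{n_i}$$

where Li, ai, ni and N(i-1) refer to the class that contains the quartile being calculated.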
71
4.2. The Deciles D
o The deciles D are values that divide the statistical
series into 10 equal parts.

o They are named D1, D2, D3, D4, D5 = Me, … and D9.

72
4.3. The Percentiles P
o The percentiles P are values that divide the
statistical series into 100 equal parts.

o They are named P1, P2, P3, …, P50 = Me, … and P99.

73
5. Box Plot
o The box plot indicates the dispersion of a given
series.

o To draw a box plot, we:

1) Draw a horizontal (or vertical) axis.

2) Represent the following values: Minimum, Q1, Me, Q3 and Maximum.

3) Construct a rectangle parallel to the axis with a length equal to the Interquartile Range IQ.
74
5. Box Plot
o The interquartile range IQ is:

$$IQ = Q_3 - Q_1$$

o An example of a box plot:

[Figure: box plot.]

o In this example: Min = 71, Q1 = 210, Me = 268, Q3 = 342 and Max = 741.
75
6. The relation between the measuring centers
o For a unimodal, moderately skewed distribution, the Median usually lies between the Mode and the Mean. This is an empirical rule, not a general theorem.

76
7. Other types of mean
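o The Geometric mean G: used to measure growth (k is the number of distinct values).

$$\log G = \frac{1}{N} \sum_{i=1}^{k} n_i \log X_i$$

o The Harmonic mean H: it is influenced most by the lowest values (whereas the arithmetic mean X̄ is influenced most by the highest values).

$$H = \frac{N}{\sum_{i=1}^{k} \frac{n_i}{X_i}}$$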

77
7. Other types of mean
o The Quadratic mean Q: it is used to calculate the standard deviation (k is the number of distinct values).

$$Q^2 = \frac{1}{N} \sum_{i=1}^{k} n_i X_i^2$$

o The relation between these means is:

$$H \le G \le \bar{X} \le Q$$
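The ordering of the four means can be checked on a small table. The sketch below reuses the discrete example of Section 1.2 (values 2, 6, 8 with frequencies 4, 3, 1); the function name is an assumption, the natural logarithm is used for G (any base gives the same result), and all values must be strictly positive.

```python
import math

def four_means(values, counts):
    """Return (H, G, arithmetic mean, Q) for a frequency table of positive values."""
    n_total = sum(counts)
    pairs = list(zip(values, counts))
    harmonic = n_total / sum(n / x for x, n in pairs)
    geometric = math.exp(sum(n * math.log(x) for x, n in pairs) / n_total)
    arithmetic = sum(n * x for x, n in pairs) / n_total
    quadratic = math.sqrt(sum(n * x ** 2 for x, n in pairs) / n_total)
    return harmonic, geometric, arithmetic, quadratic

h, g, mean, q = four_means([2, 6, 8], [4, 3, 1])
print(round(h, 3), round(g, 3), round(mean, 3), round(q, 3))
print(h <= g <= mean <= q)   # ordering of the four means
```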
78
Chapter 4

Measures of Spread

80
Outline
1. The Range W.

2. The Interquartile Range IQ.

3. The Mean and the Median absolute Deviations.

4. The Variance V(X).

5. The Standard Deviation σ(X).

6. The Coefficient of Variation CV.

7. The Approximation Errors.

81
1. The Range W
o The range of a set of data is the difference between the largest and smallest values. It is measured in the same units as the data.

$$W = \text{Highest} - \text{Lowest}$$

o It is used to represent the dispersion of the data.

o For a continuous distribution, the range is the difference between the upper boundary of the last class and the lower boundary of the first class (or the difference between the centers of the last class and the first class).
82
1.1. The Range W
o Example for a simple series or a discrete variable:

Let the following data: 4 – 5 – 16 – 18 – 3

$$W = 18 - 3 = 15$$
o Example for a continuous variable:
Xi [0; 2[ [2; 4[ [4; 6[ Total
ni 2 5 3 10
Ci 1 3 5 -

$$W = 6 - 0 = 6 \quad \text{(using the class boundaries)}$$
$$W = 5 - 1 = 4 \quad \text{(using the class centers)}$$
83
2. The Interquartile Range IQ
o IQ is the difference between the 3rd and the 1st quartiles Q3 and Q1. The IQ is used to build box plots.

$$IQ = Q_3 - Q_1$$

o IQ is used to measure the spread around the Median.

o 50% of the population lies within the interquartile range.

o The IQ is often used to find outliers in data. Outliers are observations that fall below Q1 - 1.5(IQ) or above Q3 + 1.5(IQ).
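The outlier rule can be illustrated with the five values quoted for the box plot example in Chapter 3 (Min = 71, Q1 = 210, Q3 = 342, Max = 741). The function name `outlier_fences` is an assumption made for this sketch.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) limits: Q1 - k*IQ and Q3 + k*IQ."""
    iq = q3 - q1
    return q1 - k * iq, q3 + k * iq

low, high = outlier_fences(210, 342)   # Q1 and Q3 from the box plot example
print(low, high)

for label, value in [("Min", 71), ("Max", 741)]:
    is_outlier = value < low or value > high
    print(label, value, "outlier" if is_outlier else "not an outlier")
```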
84
2.1. The Interquartile Range IQ
o Example for a continuous distribution: We study the salary in $ of 130 employees. Calculate IQ.

Xi 700-900 900-1100 1100-1300 1300-1500 1500-1700 Total

ni 15 25 55 28 7 130
Ni ↑ 15 40 95 123 130 -

o Q1: N/4 = 130/4 = 32.5

900-1100 is the class of Q1
Q1 = 900 + 200 × (32.5 – 15)/25 = 1040$

o Q3: 3N/4 = (3 × 130)/4 = 97.5

1300-1500 is the class of Q3
Q3 = 1300 + 200 × (97.5 – 95)/28 ≈ 1317.86$. Thus IQ ≈ 1317.86 – 1040 = 277.86$.
85
3.1. The Mean Absolute Deviation
o The mean absolute deviation AD(Mean) is the mean of the data's absolute deviations around the data's mean: the average (absolute) distance from the mean (k is the number of distinct values).

$$AD(\text{Mean}) = \frac{\sum_{i=1}^{k} n_i \left| X_i - \bar{X} \right|}{N}$$

86
3.2. The Median Absolute Deviation
o The median absolute deviation AD(Median) is the mean of the data's absolute deviations around the data's median: the average (absolute) distance from the median (k is the number of distinct values).

$$AD(\text{Median}) = \frac{\sum_{i=1}^{k} n_i \left| X_i - Me \right|}{N}$$

87
4. The Variance V(X)
o The variance is the square of the standard deviation; it measures how far a set of numbers is spread out from its mean (k is the number of distinct values).

$$V(X) = Var(X) = \frac{\sum_{i=1}^{k} n_i (X_i - \bar{X})^2}{N}$$

o The variance is always positive or zero.

o The variance of a constant random variable is zero, and if the variance of a variable in a data set is zero, then all the entries have the same value.
88
4. The Variance V(X)
o The properties of the variance are:

$$V(X) \ge 0$$
$$V(a) = 0$$
$$V(X + a) = V(X)$$
$$V(aX) = a^2 V(X)$$
$$V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab\,Cov(X, Y)$$
$$V(aX - bY) = a^2 V(X) + b^2 V(Y) - 2ab\,Cov(X, Y)$$

89
4. The Variance V(X)
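o Another formula of the variance is obtained by expanding the square (all sums run over i = 1, …, k):

$$
\begin{aligned}
V(X) &= \frac{1}{N} \sum n_i (X_i - \bar{X})^2 = \frac{1}{N} \sum n_i \left( X_i^2 + \bar{X}^2 - 2 X_i \bar{X} \right) \\
&= \frac{1}{N} \sum n_i X_i^2 + \frac{1}{N} \sum n_i \bar{X}^2 - \frac{2}{N} \sum n_i X_i \bar{X} \\
&= \frac{1}{N} \sum n_i X_i^2 + \frac{N}{N} \bar{X}^2 - \frac{2\bar{X}}{N} \sum n_i X_i \\
&= \frac{1}{N} \sum n_i X_i^2 + \bar{X}^2 - 2 \bar{X} \cdot \bar{X} \\
\Rightarrow V(X) &= \frac{1}{N} \sum n_i X_i^2 - \bar{X}^2
\end{aligned}
$$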
90
5. The Standard Deviation σ(X)
o The standard deviation is the square root of the variance. It is a measure that is used to quantify the amount of variation or dispersion of a set of data values.

$$\sigma(X) = \sqrt{V(X)}$$

o A low standard deviation indicates that the data points tend to be close to the mean of the set.

o A high standard deviation indicates that the data points are spread out over a wide range of values.
91
5. The Standard Deviation σ(X)
o The properties of the standard deviation are:

$$\sigma(X) \ge 0$$
$$\sigma(a) = 0$$
$$\sigma(X + a) = \sigma(X)$$
$$\sigma(aX) = |a|\,\sigma(X)$$
$$\sigma(X + Y) = \sqrt{V(X) + V(Y) + 2\,Cov(X, Y)}$$

92
6. The Coefficient of Variation CV.
o The coefficient of variation CV (also known as relative standard deviation) is defined as the ratio of the standard deviation to the mean.

$$CV = \frac{\sigma}{\bar{X}}$$

o The CV is often expressed as a percentage.

$$\%CV = \frac{\sigma}{\bar{X}} \times 100$$

o CV is a dimensionless number.
o In analytical chemistry, CV is used to express the precision and repeatability of an assay.
93
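The variance, standard deviation and coefficient of variation defined in this chapter can be computed together. The sketch below reuses the discrete table of Chapter 3 (values 2, 6, 8 with frequencies 4, 3, 1) and checks that the definition and the shortcut formula of the variance agree. Function names are assumptions, and the variance divides by N, as in this course, not by N - 1.

```python
import math

def table_mean(values, counts):
    """Mean of a frequency table: sum(ni * Xi) / N."""
    return sum(n * x for x, n in zip(values, counts)) / sum(counts)

def variance_definition(values, counts):
    """V(X) = sum(ni * (Xi - mean)^2) / N."""
    mean = table_mean(values, counts)
    return sum(n * (x - mean) ** 2 for x, n in zip(values, counts)) / sum(counts)

def variance_shortcut(values, counts):
    """V(X) = sum(ni * Xi^2) / N - mean^2."""
    mean = table_mean(values, counts)
    return sum(n * x ** 2 for x, n in zip(values, counts)) / sum(counts) - mean ** 2

values, counts = [2, 6, 8], [4, 3, 1]
v1 = variance_definition(values, counts)
v2 = variance_shortcut(values, counts)
sigma = math.sqrt(v1)                       # standard deviation
cv = sigma / table_mean(values, counts)     # coefficient of variation

print(v1, v2, round(sigma, 3), f"{cv:.1%}")
```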
7. The Approximation Errors
o The approximation error in some data is the discrepancy between an exact value and some approximation to it.

o The three most common errors are:

$$\text{Absolute Error}: \quad AE = |X - X_i|$$
$$\text{Relative Error}: \quad RE = \frac{|X - X_i|}{X_i}$$
$$\text{Percent Error}: \quad PE = RE \times 100$$
94
Chapter 5

Bivariate Distribution

96
Outline
1. Introduction.

2. Presentation in Table.

3. Means and Variances.

4. Covariance Cov(X,Y).

5. Linear Correlation Coefficient “r”.

6. Graphical Study.

7. Regression Lines.
97
1. Introduction
o In this chapter, we will study the relation between
two different variables or data. These variables may
be qualitative, discrete or continuous.

o The 1st variable is denoted Xi (the explanatory variable), and the 2nd variable is denoted Yj (the dependent variable).

o The relation between X and Y can be positive or negative.

o To simplify the study, the frequency is always equal to “1”.
98
2. Presentation in Table
o The frequency of every observation for Xi or Yj
corresponds to “1”.

Observation 1 2 3 4 Total

Xi X1 X2 X3 X4 -

Yj Y1 Y2 Y3 Y4 -

Frequency 1 1 1 1 4

99
3. Means and Variances
o For the variable Xi:

$$\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N}$$

$$V(X) = \frac{\sum_{i=1}^{N} n_i (X_i - \bar{X})^2}{N} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}$$

$$V(X) = \frac{1}{N} \sum_{i=1}^{N} X_i^2 - \bar{X}^2$$

100
3. Means and Variances
o For the variable Yj:

$$\bar{Y} = \frac{\sum_{j=1}^{N} Y_j}{N}$$

$$V(Y) = \frac{\sum_{j=1}^{N} n_j (Y_j - \bar{Y})^2}{N} = \frac{\sum_{j=1}^{N} (Y_j - \bar{Y})^2}{N}$$

$$V(Y) = \frac{1}{N} \sum_{j=1}^{N} Y_j^2 - \bar{Y}^2$$

101
4. Covariance Cov(X,Y)
o The covariance Cov(X,Y) is a measure of how much two random variables vary together. It is similar to the variance, but where the variance tells you how a single variable varies, the covariance tells you how two variables vary together. Since every observation is a pair (Xi, Yi) with frequency 1, the sum runs over the N paired observations:

$$Cov(X, Y) = \frac{1}{N} \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})$$

$$Cov(X, Y) = \frac{1}{N} \sum_{i=1}^{N} X_i Y_i - \bar{X}\,\bar{Y}$$

102
4. Covariance Cov(X,Y)
o The properties of the covariance are:

$$Cov(X + b, Y + d) = Cov(X, Y)$$
$$Cov(aX + b, cY + d) = a \cdot c \cdot Cov(X, Y)$$
$$Cov(X, X) = V(X)$$
$$|Cov(X, Y)| \le \sigma_X \cdot \sigma_Y$$
$$V(X + Y) = V(X) + V(Y) + 2\,Cov(X, Y)$$

o If X and Y are independent, then Cov(X,Y) = 0 and V(X+Y) = V(X) + V(Y). The converse is not true.
103
5. Linear Correlation Coefficient “r”
o The linear correlation coefficient “r” is a measure of the strength of the linear relation between two different variables.

$$r = \frac{Cov(X, Y)}{\sqrt{V(X) \cdot V(Y)}} = \frac{Cov(X, Y)}{\sigma_X \cdot \sigma_Y}$$

o The value of “r” does not vary with the change of the origin or the scale in the graph.

o If X and Y are independent, then r = 0. The converse is not true.
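The covariance and the correlation coefficient can be computed directly from paired observations. The data below are hypothetical (they do not come from the course) and the function names are assumptions; each observation has frequency 1, as in this chapter.

```python
import math

def mean(data):
    return sum(data) / len(data)

def covariance(xs, ys):
    """Cov(X, Y) = (1/N) * sum((Xi - mean_x) * (Yi - mean_y)) over paired observations."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """r = Cov(X, Y) / sqrt(V(X) * V(Y)), using Cov(X, X) = V(X)."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical paired observations (Xi, Yi).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

cov_xy = covariance(xs, ys)
r = correlation(xs, ys)
print(round(cov_xy, 3), round(r, 3))
```

A positive covariance gives a positive r: here Y tends to increase when X increases.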
104
5. Linear Correlation Coefficient “r”
o The properties of the linear correlation coefficient “r” are:

$$-1 \le r \le 1$$
* r ↦ +1 : Strong Positive Correlation
* r ↦ -1 : Strong Negative Correlation
* r ↦ 0 : Weak Correlation
* r = 0 : No Correlation

o The sign of “r” depends on the sign of the Cov(X,Y).

105
6. Graphical Study
o The relation between X and Y is defined using a
mathematical equation which links these two
variables.

o We represent every observation (Xi,Yj) with a point

in a Cartesian graph.

o The equation of curves that link X and Y can be:

Line, Parabola, Hyperbola, Cubic curve, Exponential,
Logarithmic, Power function, ….

o In this chapter, we will study the equation of a line.

106
6. Graphical Study
o The set of all plotted points is called the “Dispersion diagram” (scatter diagram) or “Cloud of points”.

o We always plot the Mean Point named G(X̄, Ȳ) (or the center of gravity).

107
6. Graphical Study
o Example: Strong Positive Linear Correlation: r ↦ +1.

o X increases and Y increases, or X decreases and Y

decreases.

108
6. Graphical Study
o Example: Strong Negative Linear Correlation: r ↦ -1.

o X increases and Y decreases, or X decreases and Y

increases.

109
6. Graphical Study
o Example: No Correlation: r ↦ 0. Thus Cov(X,Y) ↦ 0.

o No relation between X and Y.

110
7. Regression Lines
o The regression line of Y on X is denoted (D):

$$(D_{Y/X}): \quad Y = aX + b$$

o To calculate a and b:

$$a = \frac{Cov(X, Y)}{V(X)}$$
$$b = \bar{Y} - a\bar{X}$$
111
7. Regression Lines
o The regression line of X on Y is denoted (D’):

$$(D'_{X/Y}): \quad X = a'Y + b'$$

o To calculate a’ and b’:

$$a' = \frac{Cov(X, Y)}{V(Y)}$$
$$b' = \bar{X} - a'\bar{Y}$$
112
7. Regression Lines
o The relation between a and a’ is:

$$r^2 = a \cdot a'$$

$$r = \pm\sqrt{a \cdot a'} \quad \text{(r has the same sign as a and a')}$$

o (D) and (D’) intersect at the mean point G.
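The two regression lines and the relation between their slopes and r can be checked numerically. The sketch below uses hypothetical paired data (not from the course); the function names are assumptions.

```python
def mean(data):
    return sum(data) / len(data)

def covariance(xs, ys):
    """Cov(X, Y) over paired observations, each with frequency 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def regression_lines(xs, ys):
    """Return (a, b, a_prime, b_prime) for Y = aX + b and X = a'Y + b'."""
    mx, my = mean(xs), mean(ys)
    cov = covariance(xs, ys)
    a = cov / covariance(xs, xs)          # a = Cov(X, Y) / V(X)
    b = my - a * mx
    a_prime = cov / covariance(ys, ys)    # a' = Cov(X, Y) / V(Y)
    b_prime = mx - a_prime * my
    return a, b, a_prime, b_prime

# Hypothetical paired observations (Xi, Yi).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b, a_prime, b_prime = regression_lines(xs, ys)
r_squared = a * a_prime                   # r^2 = a * a'
print(round(a, 3), round(b, 3), round(a_prime, 3), round(b_prime, 3), round(r_squared, 3))

# Both lines pass through the mean point G.
print(round(a * mean(xs) + b, 3), round(a_prime * mean(ys) + b_prime, 3))
```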

113
Calculator
o Shift Mode 3 = =

o Mode 3 1 REG

o Xi , Yj M+

o Shift 2: X̄, Ȳ, σ(X), σ(Y). Thus: V(X), V(Y).

o Shift 2: B = a, A = b, and r.
o Y = aX + b = BX + A (on the calculator).
114
Chapter 6

Events and Probability

116
Outline
1. Introduction.

2. Intersection and Union.

3. Disjoint and Independent Events.

4. Properties of Operations on Events.

5. Uniform and Statistical Probability.

6. Conditional Probability (The Tree).

7. Permutations, Arrangements and Combinations.

117
1. Introduction
o An event is a collection of outcomes of an
experiment to which a probability is assigned, and it
is noted A.

o The set of all possible outcomes of a probability

experiment is called sample space Ω. Thus P(Ω) = 1.

o Ω = (A, B, C, D, ……).

o The empty set ϕ, or impossible event, is never realized. Thus P(ϕ) = 0.

o ϕ = ( ).
118
1. Introduction
o An event A is a subset of the sample space.

A ⊂ Ω

o Any event A which consists of a single outcome in the sample space is called an elementary or simple event. An example of an elementary event: A = (4).

o Thus Cardinal (A) = Card (A) = 1, if A is an elementary event.
119
1. Introduction
o The complementary event Ā is associated with the event A.

o Ā refers to the elements of Ω that are not in A.

[Figure: diagram of A and Ā.]
o Ā is realized if A is not realized, and vice versa.

o Thus:

$$P(\bar{A}) = 1 - P(A)$$
120
2.1. Intersection
o The intersection A ∩ B of two sets A and B is the set
that contains all elements of A that also belong to B
(or equivalently, all elements of B that also belong to
A), but no other elements.

o A ∩ B = A and B.
121
2.2. Union
o The union A ∪ B of two sets A and B is the set of
elements which are in A, in B, or in both A and B

o A ∪ B = A or B.
122
3.1. Disjoint Events
o Two disjoint events are two events that cannot occur at the same time: no elements are common to both.

o Thus: A ∩ B = ϕ, and P(A ∩ B) = 0. Since, in general,

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

for two disjoint events we get:

$$P(A \cup B) = P(A) + P(B)$$
123
3.2. Independent Events
o A and B are two independent events if they are not linked. Thus A and B are unrelated events; the outcome of one event does not impact the outcome of the other event.

o A and B are two independent events if:

$$P(A \cap B) = P(A) \times P(B)$$

124
4. Properties of Operations on Events
o Let A, B and C be three events:

1) Commutativity:

$$A \cup B = B \cup A$$
$$A \cap B = B \cap A$$

2) Associativity:

$$A \cup (B \cup C) = (A \cup B) \cup C$$
$$A \cap (B \cap C) = (A \cap B) \cap C$$
125
4. Properties of Operations on Events
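3) Distributivity:
$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$$
$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$$
4) Negation rule:

$$\bar{\bar{A}} = A; \quad \bar{\Omega} = \phi; \quad \bar{\phi} = \Omega$$
$$\overline{A \cup B} = \bar{A} \cap \bar{B}$$
$$\overline{A \cap B} = \bar{A} \cup \bar{B}$$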
126
5. Uniform and Statistical Probability
1) Uniform Probability: Let Ω be the sample space and A an event of Ω. Then the probability of A is:

$$P(A) = \frac{\text{Number of ways in which A can happen}}{\text{Number of all possible ways}}$$

2) Statistical Probability: The probability of an event A is the relative frequency “f” of the realization of this event. Then the probability of A is:

$$P(A) = f(A)$$
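Uniform probability is just counting, so it can be illustrated with sets. The example below is hypothetical (one throw of a fair six-sided die, not taken from the course) and checks the complement rule, the union rule and the independence condition stated earlier in this chapter.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}     # sample space: one throw of a fair die (hypothetical example)

def prob(event):
    """Uniform probability: number of favourable outcomes / number of possible outcomes."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}                  # "the result is even"
B = {1, 2, 3, 4}               # "the result is at most 4"

print(prob(A), prob(B), prob(A & B), prob(A | B))
print(prob(omega - A) == 1 - prob(A))                    # complement rule
print(prob(A | B) == prob(A) + prob(B) - prob(A & B))    # union rule
print(prob(A & B) == prob(A) * prob(B))                  # True here: A and B are independent
```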
127
6. Conditional Probability (The Tree)
o The conditional probability of an event A, given that the event B has occurred, is denoted P(A/B) and is read “Probability of A given B”.

$$P(A / B) = \frac{P(A \cap B)}{P(B)}$$

o Thus, the probability of B given A, P(B/A), is:

$$P(B / A) = \frac{P(A \cap B)}{P(A)}$$
128
6. Conditional Probability (The Tree)
o If A and B are two independent events, then:

$$P(A / B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A) \times P(B)}{P(B)} = P(A)$$

$$P(B / A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A) \times P(B)}{P(A)} = P(B)$$

129
6. Conditional Probability (The Tree)
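o Bayes' theorem relates the conditional probability
P(A/B) to the reverse conditional probability P(B/A):

\[ P(A/B) = \frac{P(B/A) \times P(A)}{P(B)} \]
\[ P(B/A) = \frac{P(A/B) \times P(B)}{P(A)} \]

o Another form of the formula, with P(B) expanded over
A and its complement, is:

\[ P(A/B) = \frac{P(B/A) \times P(A)}{P(B/A) \times P(A) + P(B/\overline{A}) \times P(\overline{A})} \]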
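
The second form of the formula can be evaluated step by step. The sketch below uses made-up numbers (they are assumptions, not data from the course): A is a condition present in 1% of a population, and B is a positive test result that occurs in 90% of cases when A holds and in 5% of cases when it does not.

```python
from fractions import Fraction

# Hypothetical inputs, chosen only to illustrate the formula.
p_a = Fraction(1, 100)               # P(A)
p_b_given_a = Fraction(9, 10)        # P(B/A)
p_b_given_not_a = Fraction(1, 20)    # P(B/Ā)

p_not_a = 1 - p_a                    # P(Ā)

# Denominator: P(B) = P(B/A) × P(A) + P(B/Ā) × P(Ā)
p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a

# Bayes' theorem: P(A/B) = P(B/A) × P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(p_b, p_a_given_b, float(p_a_given_b))
```

With these assumed values P(A/B) is far smaller than P(B/A), which is why the two conditional probabilities should not be confused.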
7.1. Permutations
o Permutation: an ordered arrangement of all “n” distinct
objects of a set.

o The number of permutations is:

\[ \text{Number of permutations} = n! \]

o Example: For the set {A, B, C}.

o The number of permutations is 3! = 6. The
permutations are: ABC, ACB, BAC, BCA, CAB and
CBA.
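
The example can be reproduced with Python's standard library. This is a minimal sketch using `itertools.permutations` to list the orderings and `math.factorial` to count them.

```python
from itertools import permutations
from math import factorial

items = ["A", "B", "C"]

# Every ordering of all three letters, joined into strings.
perms = ["".join(p) for p in permutations(items)]

print(perms)
print(len(perms), factorial(len(items)))
```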
7.2. Arrangements
o Arrangements: We choose “p” objects from “n”
objects, and the order of selection matters.

o The number of arrangements is:

\[ A_n^p = \frac{n!}{(n - p)!} \]

o Example: For the set {A, B, C}.

o The number of arrangements of 2 from 3 is equal to
3!/(3-2)! = 3!/1! = 6. The arrangements are: AB, BA,
AC, CA, BC and CB.
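
Arrangements are what Python calls permutations of length p. The sketch below, which assumes Python 3.8 or later for `math.perm`, lists the ordered pairs from the example and compares the count with n!/(n - p)!.

```python
from itertools import permutations
from math import factorial, perm

items = ["A", "B", "C"]
n, p = len(items), 2

# Ordered selections of p letters: AB and BA count as different.
arrangements = ["".join(a) for a in permutations(items, p)]

# Count from the formula n! / (n - p)!
count_from_formula = factorial(n) // factorial(n - p)

print(arrangements)
print(len(arrangements), count_from_formula, perm(n, p))
```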
7.3. Combinations
o Combinations: We choose “p” objects from “n”
objects, and the order of selection does not matter.

o The number of combinations is:

\[ C_n^p = \frac{n!}{p!\,(n - p)!} = \frac{A_n^p}{p!} \]

o Example: For the set {A, B, C}.

o The number of combinations of 2 from 3 is equal to
3!/(2!(3-2)!) = 3!/(2! × 1!) = 3. The combinations are: AB, AC
and BC.
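
The link between combinations and arrangements can be checked directly: each unordered choice of p objects corresponds to p! ordered arrangements. The sketch below assumes Python 3.8 or later for `math.comb` and `math.perm`.

```python
from itertools import combinations
from math import comb, factorial, perm

items = ["A", "B", "C"]
n, p = len(items), 2

# Unordered selections of p letters: AB and BA are the same choice.
combos = ["".join(c) for c in combinations(items, p)]

# Number of arrangements divided by p! gives the number of combinations.
count_from_arrangements = perm(n, p) // factorial(p)

print(combos)
print(len(combos), count_from_arrangements, comb(n, p))
```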
Notes
o Some notes on probability:

\[ 0 \le P(A) \le 1 \]
\[ P(\Omega) = 1; \quad P(\emptyset) = 0 \]
\[ A \cup \overline{A} = \Omega \]
\[ A \cap \overline{A} = \emptyset \]

o If A ⊂ B, then:

A ∩ B = A
and A ∪ B = B
o Let A and B be two events; then:

\[ A = (A \cap B) \cup (A \cap \overline{B}) \]
\[ P(A) = P(A \cap B) + P(A \cap \overline{B}) \]
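
This decomposition splits A into the part inside B and the part outside B, two disjoint pieces whose probabilities add up. The sketch below checks it on an assumed fair-die example; the events and the helper `prob` are hypothetical.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
omega = set(range(1, 7))

def prob(event):
    """Uniform probability of an event (a subset of omega)."""
    return Fraction(len(event), len(omega))

A = {2, 4, 6}       # the roll is even
B = {4, 5, 6}       # the roll is more than 3
not_B = omega - B   # complement of B

inside = A & B      # part of A that lies in B
outside = A & not_B # part of A that lies outside B

pieces_rebuild_a = (inside | outside) == A
pieces_are_disjoint = (inside & outside) == set()
probabilities_add = prob(A) == prob(inside) + prob(outside)

print(pieces_rebuild_a, pieces_are_disjoint, probabilities_add)
```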