MATH 322: Probability and Statistical Methods
LECTURE SLIDES
CHAPTER 1
DESCRIPTIVE STATISTICS
▪ Populations, Samples, and Processes
▪ Measures of Variability
Exercises
DESCRIPTIVE STATISTICS
Population: In collecting data it is often impossible or impractical to observe the entire group, for
example the grains of sand on a beach, the defective bolts produced in a factory on a given day, or all
possible outcomes in successive tosses of a fair coin. Therefore, instead of examining the entire group,
called the Population (universe), one examines a small part of it that represents the group, called the
Sample. A population can be finite or infinite.
Raw Data: Collected data that have not yet been organized. The data need not be numerical, e.g. the
weights of a certain set of students, the days of the week, etc.
Array: Arrangement of raw numerical data in ascending or descending order.
Class Interval: A class interval is a division of the data for use in a Histogram (a type of bar
graph). For instance, scores on a 100-point test can be partitioned into the class intervals 1-25,
26-49, 50-74 and 75-100. The end numbers are called class limits; the smaller number of each
interval is its Lower Class Limit (LCL) and the larger number is its Upper Class Limit (UCL). The
numbers 0.5-25.5, 25.5-49.5, 49.5-74.5, 74.5-100.5 are called class boundaries. For example, 0.5 is
the lower class boundary and 25.5 is the upper class boundary of the first class.
Class Interval Size (width, c): upper class boundary − lower class boundary.
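The relation between class limits, class boundaries and class width can be checked with a short Python sketch. The function names and the `unit` parameter (the measurement precision, taken here to be 1 for whole-number scores) are illustrative assumptions, not part of the slides.

```python
def class_boundaries(limits, unit=1):
    """Convert (lower limit, upper limit) pairs into class boundaries.

    Assumption: each boundary lies half a measurement unit outside
    the corresponding class limit.
    """
    half = unit / 2
    return [(lcl - half, ucl + half) for lcl, ucl in limits]


def class_width(boundary):
    """Class interval size c = upper class boundary - lower class boundary."""
    lower, upper = boundary
    return upper - lower


# Class limits from the 100-point test example.
limits = [(1, 25), (26, 49), (50, 74), (75, 100)]
boundaries = class_boundaries(limits)
widths = [class_width(b) for b in boundaries]

print(boundaries)  # [(0.5, 25.5), (25.5, 49.5), (49.5, 74.5), (74.5, 100.5)]
print(widths)      # [25.0, 24.0, 25.0, 26.0]
```

Note that the four classes in this example do not all have the same width.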
[Figure: histogram of battery lives (in years). X-axis: class intervals 1.5-1.9, 2.0-2.4, 2.5-2.9,
3.0-3.4, 3.5-3.9, 4.0-4.4, 4.5-4.9, each bar labelled with its class midpoint. Y-axis: frequency.]
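FREQUENCY POLYGONS:
A frequency polygon is drawn like a histogram, except that a point is plotted at the midpoint of
each class interval, at the height of its frequency, instead of a bar, and consecutive points are
joined by straight lines. The X-axis begins with the midpoint of the interval immediately below the
lowest interval and ends with the midpoint of the interval immediately above the highest interval;
both of these extra points have frequency zero.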
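As a rough illustration, the vertices of a frequency polygon can be listed in Python. The sketch assumes equal class widths and uses the battery-life classes (midpoints 1.7 to 4.7, width 0.5); the frequencies are assumed to be those of the battery-life frequency table used for the ogive in this chapter. The function name is hypothetical.

```python
def polygon_points(midpoints, frequencies, width):
    """Return the (x, y) vertices of a frequency polygon.

    One extra midpoint with frequency 0 is added on each side so that
    the polygon starts and ends on the X-axis. Assumes equal class widths.
    """
    first = round(midpoints[0] - width, 2)   # midpoint of the class below the lowest
    last = round(midpoints[-1] + width, 2)   # midpoint of the class above the highest
    xs = [first] + list(midpoints) + [last]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))


midpoints = [1.7, 2.2, 2.7, 3.2, 3.7, 4.2, 4.7]
frequencies = [2, 1, 4, 15, 10, 5, 3]  # assumed: battery-life frequency table
points = polygon_points(midpoints, frequencies, width=0.5)

for x, y in points:
    print(x, y)
```

Joining the printed points in order with straight lines gives the polygon.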
Relative Frequency of a class: The frequency of the class divided by the total frequency of all
classes. It is usually expressed as a percentage.
Relative Frequency Distribution: Arrangement of data by classes together with the corresponding
relative frequencies.
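A minimal Python sketch of a relative frequency distribution follows. It uses the class frequencies of the battery-life table that appears in the ogive example of this chapter; the function name is illustrative.

```python
def relative_frequencies(frequencies):
    """Divide each class frequency by the total frequency of all classes."""
    total = sum(frequencies)
    return [f / total for f in frequencies]


# Class frequencies of the battery-life data (classes 1.45-1.95, ..., 4.45-4.95).
frequencies = [2, 1, 4, 15, 10, 5, 3]
total = sum(frequencies)
relative = relative_frequencies(frequencies)
percentages = [100 * r for r in relative]

print(total)        # 40
print(relative[3])  # 0.375, i.e. 37.5% of the batteries fall in 2.95-3.45
```

The relative frequencies of all classes always add up to 1 (100%).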
Cumulative Frequency: The total frequency of all values less than the upper class boundary of a
given class interval is called the cumulative frequency up to and including that class interval.
Plotting the scores on the X-axis and the cumulative frequencies on the Y-axis gives the Ogive
(cumulative frequency polygon). Each point is plotted at the upper class boundary of an interval,
at the height of the corresponding cumulative frequency.
HOW TO DRAW AN OGIVE?
Class Boundaries    Frequency (fi)    Class Boundaries    Cumulative Frequencies (cfi)
                                      less than 1.45       0
1.45-1.95            2                less than 1.95       2
1.95-2.45            1                less than 2.45       3
2.45-2.95            4                less than 2.95       7
2.95-3.45           15                less than 3.45      22
3.45-3.95           10                less than 3.95      32
3.95-4.45            5                less than 4.45      37
4.45-4.95            3                less than 4.95      40
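The cumulative frequencies in the table above are running totals of the class frequencies. A short Python sketch, with variable names chosen only for illustration, reproduces them and lists the points to plot for the ogive.

```python
from itertools import accumulate

# Upper class boundaries and class frequencies from the table above.
upper_boundaries = [1.95, 2.45, 2.95, 3.45, 3.95, 4.45, 4.95]
frequencies = [2, 1, 4, 15, 10, 5, 3]

# Running totals: cumulative frequency below each upper class boundary.
cumulative = list(accumulate(frequencies))

# The ogive starts at the lowest class boundary with cumulative frequency 0.
ogive_points = [(1.45, 0)] + list(zip(upper_boundaries, cumulative))

print(cumulative)  # [2, 3, 7, 22, 32, 37, 40]
```

The last cumulative frequency always equals the total number of observations, here 40.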
[Figure: ogive. X-axis: class boundaries. Y-axis: cumulative frequencies (cfi). Plotted points:
(1.45, 0), (1.95, 2), (2.45, 3), (2.95, 7), (3.45, 22), (3.95, 32), (4.45, 37), (4.95, 40).]
Ogive: The graph showing the cumulative frequency below each upper class boundary, plotted against
that boundary, is called a cumulative-frequency polygon or Ogive.
Cumulative Relative Frequencies (Percentage Ogive), N = 40
Class Boundaries    Cumulative Frequencies (cfi)    Cumulative Relative Frequencies (cfi/N)
less than 1.45       0                              0.000
less than 1.95       2                              0.050
less than 2.45       3                              0.075
less than 2.95       7                              0.175
less than 3.45      22                              0.550
less than 3.95      32                              0.800
less than 4.45      37                              0.925
less than 4.95      40                              1.000
[Figure: percentage ogive. X-axis: class boundaries. Y-axis: cumulative relative frequencies (cfi/N).]
EXAMPLE 1.1:
The frequency distribution of the ages of a sample of 400 diabetics obtained by a research physician
is given below.
MEASURES OF LOCATION
The Sample Mean: One obvious and very useful measure is the Sample Mean. The mean is simply a
numerical average.
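As a small illustration, the sample mean is the sum of the observations divided by their number. The Python sketch below applies this to the ten battery lifetimes (in hours) listed in Example 2 of this chapter; the function name is illustrative.

```python
def sample_mean(values):
    """Sample mean: the sum of the observations divided by the sample size n."""
    return sum(values) / len(values)


# Battery lifetimes (hours) from Example 2.
lifetimes = [123, 116, 122, 110, 175, 126, 125, 111, 118, 117]
mean = sample_mean(lifetimes)

print(sum(lifetimes), len(lifetimes))  # 1243 10
print(mean)                            # 124.3
```

The single large value 175 pulls the mean above most of the other observations.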
Dispersion: The degree to which numerical data tend to spread about an average value is called the
dispersion or variation. The most common measures of dispersion are the range, the variance and the
standard deviation.
Sample Range: The simplest measure of variability (dispersion) is the sample range, Xmax − Xmin.
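The three measures of dispersion can be computed with Python's standard `statistics` module. This is a sketch under one stated assumption: the sample variance uses the divisor n − 1, which is what `statistics.variance` implements; the definition on the slides did not survive, so this convention is assumed rather than confirmed. The data are the ten battery lifetimes (in hours) from Example 2.

```python
import statistics

# Battery lifetimes (hours) from Example 2.
lifetimes = [123, 116, 122, 110, 175, 126, 125, 111, 118, 117]

# Sample range: largest observation minus smallest observation.
sample_range = max(lifetimes) - min(lifetimes)

# Sample variance with divisor n - 1 (assumed convention).
variance = statistics.variance(lifetimes)

# Sample standard deviation: the square root of the sample variance.
std_dev = statistics.stdev(lifetimes)

print(sample_range)  # 65
print(round(variance, 2), round(std_dev, 2))
```

The range depends only on the two extreme observations, while the variance and standard deviation use every observation.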
Definition:
EXAMPLE 2:
A manufacturer of electronic components is interested in determining the lifetime of a certain type of
battery. A sample, in hours of life, is as follows:
123, 116, 122, 110, 175, 126, 125, 111, 118, 117
(a) Construct the frequency table using the following classes: 2-7, 8-13, 14-19, 20-25, 26-31, 32-37.
(b) Draw the relative cumulative frequency Histogram and the Percentage Ogive.
(c) Estimate the percentage of houses whose age is under 15 years.
a) Class Intervals    Frequencies
   2-7                 6
   8-13               10
   14-19               7
   20-25               4
   26-31               2
   32-37               1
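b) Percentage Ogive
   Class Boundaries    Cumulative Frequencies    Relative Cumulative Frequencies
   < 1.5                0                        0.00
   < 7.5                6                        0.20
   < 13.5              16                        0.53
   < 19.5              23                        0.77
   < 25.5              27                        0.90
   < 31.5              29                        0.97
   < 37.5              30                        1.00
   [Figure: percentage ogive. X-axis: class boundaries. Y-axis: relative cumulative frequencies.
   Plotted points: (1.5, 0), (7.5, 0.2), (13.5, 0.53), (19.5, 0.77), (25.5, 0.9), (31.5, 0.97),
   (37.5, 1).]
c) The age 15 lies in the class with boundaries 13.5-19.5 (width 6, frequency 7), so by linear
   interpolation the number of houses under 15 years is
   6 + 10 + [(15 − 13.5) / 6] × 7 = 17.75 ≈ 18,
   which is 17.75 / 30 ≈ 59% of the houses.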
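The interpolation used in part (c) can be sketched in Python. The function name is hypothetical; the class boundaries and frequencies are those of parts (a) and (b), and observations are assumed to be spread evenly within each class.

```python
def cumulative_below(x, boundaries, frequencies):
    """Estimate the number of observations below x.

    Whole classes below x are counted in full; the class containing x
    contributes a share proportional to how far x lies into it.
    """
    total = 0.0
    for (lower, upper), f in zip(boundaries, frequencies):
        if x >= upper:
            total += f                                  # class lies entirely below x
        elif x > lower:
            total += (x - lower) / (upper - lower) * f  # partial class
    return total


boundaries = [(1.5, 7.5), (7.5, 13.5), (13.5, 19.5),
              (19.5, 25.5), (25.5, 31.5), (31.5, 37.5)]
frequencies = [6, 10, 7, 4, 2, 1]

count = cumulative_below(15, boundaries, frequencies)
share = count / sum(frequencies)

print(count)                 # 17.75
print(round(100 * share, 1)) # 59.2
```

Reading the percentage ogive at the class boundary 15 gives the same estimate graphically.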
Example: Complete the tables given below.
Example: The marks (out of 50) obtained by 40 students in a class are given in the table below.
Find the mode of the data.
Marks (out of 50)     42   36   30   45   50
Number of Students     7   10   13    8    2
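For the marks table above, the mode is the mark with the largest number of students. A minimal Python sketch follows; the function name is illustrative, and it returns a list so that ties between equally frequent values would all be reported.

```python
def mode_from_table(values, counts):
    """Return every value whose frequency equals the largest frequency."""
    top = max(counts)
    return [v for v, c in zip(values, counts) if c == top]


# Marks table: each mark and the number of students who obtained it.
marks = [42, 36, 30, 45, 50]
students = [7, 10, 13, 8, 2]

modes = mode_from_table(marks, students)

print(sum(students))  # 40
print(modes)          # [30]
```

The mark 30 was obtained by 13 students, more than any other mark, so it is the mode of this table.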
Mean = 39 1/11;
Mode = 16;
Median = 16