Week3 Frequency Analysis
Frequency Analysis and Parameter Estimation
Characteristics of a Sample to be Quantitatively Adequate
Uncertainty
• Estimates of the population properties (probability distribution
function, parameters) obtained by statistical analysis of a sample
are generally not equal to the true values of the population.
Frequency Analysis
It is not possible to observe the entire population of a random
variable, so the probability distribution is assumed to be
equivalent to the frequency distribution obtained by analyzing
the sample.
Frequency Analysis - Definitions
Raw Data
Raw data are collected data that have not been organized
numerically. An example is the set of weights of 100 male students
obtained from an alphabetical listing of university records.
Array
An array is an arrangement of raw numerical data in ascending or
descending order of magnitude.
Range
The difference between the largest and smallest numbers is called
the range of the data. For example, if the largest weight of 100 male
students is 74 kg and the smallest weight is 60 kg, the range is 14 kg.
Frequency Distributions
When summarizing large masses of raw data, it is often useful to
distribute the data into classes, or categories, and to determine the
number of individuals belonging to each class, which is called the
class frequency.
Table 2.1 Weights of 100 Male Students at MAT271E Course
66 68 63 68 68 67 70 69 69 63
67 67 67 67 68 67 70 71 69 64
66 68 64 67 68 66 67 70 69 64
66 68 65 62 64 65 70 68 66 61
68 70 70 67 69 64 64 65 69 65
71 68 68 67 66 68 63 71 71 65
68 66 67 67 69 66 66 68 65 70
70 70 69 67 68 69 71 62 64 71
68 66 67 67 69 66 70 70 63 60
61 73 72 64 74 74 73 73 72 74
The terms class and class interval are often used interchangeably, although the
class interval is actually a symbol for the class.
Class Mark
The class mark is the midpoint of the class interval, calculated by dividing the
sum of the lower and upper class limits by 2.
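The definitions so far (array, range, class frequency, class mark) can be tried on the weights of Table 2.1. The sketch below assumes classes of width 3 kg starting at 60 kg (60–62, ..., 72–74), which is consistent with the class 66–68 referred to in these notes. The names `weights`, `class_limits` and `class_of` are illustrative only.

```python
from collections import Counter

# Raw data: the 100 weights (kg) listed in Table 2.1.
weights = [
    66, 68, 63, 68, 68, 67, 70, 69, 69, 63,
    67, 67, 67, 67, 68, 67, 70, 71, 69, 64,
    66, 68, 64, 67, 68, 66, 67, 70, 69, 64,
    66, 68, 65, 62, 64, 65, 70, 68, 66, 61,
    68, 70, 70, 67, 69, 64, 64, 65, 69, 65,
    71, 68, 68, 67, 66, 68, 63, 71, 71, 65,
    68, 66, 67, 67, 69, 66, 66, 68, 65, 70,
    70, 70, 69, 67, 68, 69, 71, 62, 64, 71,
    68, 66, 67, 67, 69, 66, 70, 70, 63, 60,
    61, 73, 72, 64, 74, 74, 73, 73, 72, 74,
]

# Assumed class limits: width 3 kg starting at 60 kg (60-62, ..., 72-74).
class_limits = [(lo, lo + 2) for lo in range(60, 75, 3)]


def class_of(x):
    """Return the (lower, upper) class limits that contain x."""
    for lo, hi in class_limits:
        if lo <= x <= hi:
            return (lo, hi)
    raise ValueError(f"{x} is outside every class")


array = sorted(weights)                 # the "array": data in ascending order
data_range = array[-1] - array[0]       # range = largest - smallest
tally = Counter(class_of(x) for x in weights)
frequencies = [tally[c] for c in class_limits]
class_marks = [(lo + hi) / 2 for lo, hi in class_limits]

print("range:", data_range, "kg")
for (lo, hi), mark, f in zip(class_limits, class_marks, frequencies):
    print(f"{lo}-{hi} kg  class mark {mark:4.1f}  frequency {f}")
```

With these assumed classes, the tally reproduces the frequencies 5, 18 and 42 quoted later for the first three classes.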
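Class Boundaries
Class boundaries are the points midway between the upper limit of one
class and the lower limit of the next. For the class 66–68 in Table 2.1,
the class boundaries are 65.5 kg and 68.5 kg.
Histograms and Frequency Polygons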
A histogram is a graphical representation of a frequency distribution.
If the class intervals all have equal size, the heights of the rectangles
are proportional to the class frequencies, and it is then customary to
take the heights numerically equal to the class frequencies.
The histogram and frequency polygon corresponding to the frequency
distribution of weights in Table 2.1 are shown on the same set of axes in
the Figure.
Relative Frequency Histogram
If the frequencies in Table 2.1 are replaced with the corresponding relative
frequencies, the resulting table is called a relative–frequency distribution,
percentage distribution, or relative–frequency table.
The relative frequency of a class is the frequency of the class divided by
the total frequency of all classes and is generally expressed as a percentage.
For example, the relative frequency of the class 66–68 in Table 2.1 is
42/100 = 42%.
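As a small sketch of this calculation, the class frequencies below are the ones obtained by tallying the raw weights of Table 2.1 into classes of width 3 kg starting at 60 kg (an assumed grouping, consistent with the class 66–68 above). The dictionary name `frequencies` is illustrative.

```python
# Class frequencies tallied from the raw weights in Table 2.1
# (assumed classes: 60-62, 63-65, 66-68, 69-71, 72-74 kg).
frequencies = {"60-62": 5, "63-65": 18, "66-68": 42, "69-71": 27, "72-74": 8}

total = sum(frequencies.values())

# Relative frequency = class frequency / total frequency.
relative = {c: f / total for c, f in frequencies.items()}

for c, r in relative.items():
    print(f"{c} kg: {r:.0%}")
```

The relative frequencies of all classes add up to 1 (100%).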
Cumulative–Frequency Distributions and Ogives
The total frequency of all values less than the upper class boundary of a
given class interval is called the cumulative frequency up to and including
that class interval. For example, the cumulative frequency up to and
including the class interval 66–68 in Table 2.1 is 5 + 18 + 42 = 65, signifying
that 65 students have weights less than 68.5 kg.
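The running sum described here can be sketched with `itertools.accumulate`. The frequencies are those tallied from Table 2.1 for classes of width 3 kg starting at 60 kg (an assumed grouping). The upper class boundaries other than 68.5 kg are inferred by the same rule.

```python
from itertools import accumulate

# Upper class boundaries (kg) and class frequencies for the assumed
# classes 60-62, 63-65, 66-68, 69-71, 72-74 of Table 2.1.
upper_boundaries = [62.5, 65.5, 68.5, 71.5, 74.5]
frequencies = [5, 18, 42, 27, 8]

# Cumulative frequency up to and including each class.
cumulative = list(accumulate(frequencies))
less_than = dict(zip(upper_boundaries, cumulative))

for boundary, count in less_than.items():
    print(f"less than {boundary} kg: {count} students")
```

Plotting these (upper boundary, cumulative frequency) pairs as connected points gives the cumulative-frequency polygon.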
A graph showing the cumulative frequency less than any upper class
boundary plotted against the upper class boundary is called a
cumulative–frequency polygon, or ogive, and is shown in Fig. 2–2
for the student weight distribution of Table 2.1.
Frequency Analysis of Continuous Variables
General Rules for Forming Frequency Distributions of Continuous Data
1. Determine the largest and smallest numbers in the raw data and
find the range (the difference between the largest and smallest
numbers).
2. Divide the range into a convenient number of class intervals
having the same size. A convenient number of classes M for a
sample of N observations can be estimated with the formula
(Sturges' rule, with a base-10 logarithm)
M = 1 + 3.3 log10(N)
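A minimal sketch of the formula, assuming the logarithm is base 10 and that the result is rounded to the nearest integer (the rounding is an assumption; the formula only gives a guide value). The function name `number_of_classes` is illustrative.

```python
import math


def number_of_classes(n):
    """Guide value for the number of classes: M = 1 + 3.3 * log10(N)."""
    return 1 + 3.3 * math.log10(n)


n = 100                      # e.g. the 100 weights of Table 2.1
m = number_of_classes(n)     # 1 + 3.3 * 2 = 7.6
m_rounded = round(m)         # assumed rounding to a whole number of classes

print(f"N = {n}: M = {m:.1f}, rounded to {m_rounded} classes")
```

The value is a starting point only; the class size is usually adjusted afterwards so that the class limits are convenient numbers.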
General Rules for Forming Frequency Distributions of Discrete Data
Since discrete data sets typically consist only of integers, it is
generally not necessary to group the values into intervals. Instead,
each distinct value may be used directly as a class. If the discrete
data are more widely dispersed, they may be grouped as explained in
the rules for continuous data.
• The sum of two dice is a discrete variable, and its histogram
can generally be drawn directly without grouping.
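As a sketch of the dice example, the code below enumerates the 36 equally likely outcomes of two fair six-sided dice and uses each possible sum directly as a class. These are theoretical counts, not observed data, and the text bars only stand in for a histogram.

```python
from collections import Counter
from itertools import product

# All 36 outcomes of two fair six-sided dice; each sum is its own class.
sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for s in sorted(sums):
    print(f"{s:2d} | {'*' * sums[s]}  ({sums[s]})")
```

No grouping is needed because the variable takes only the eleven integer values 2 to 12.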
Types of Frequency Curves
Frequency curves arising in practice take on certain characteristic
shapes, as shown in the figure.
1. The symmetrical, or bell–shaped, frequency curves are
characterized by the fact that observations equidistant from the
central maximum have the same frequency. An important
example is the normal curve.
The cumulative frequency distribution of the data is shown in
the figure. It shows that 50% of the rainfall values are below 600 mm.
The appearance of the frequency histogram is affected by the
number of class intervals. The use of too few classes causes too
much loss of information, whereas too many class intervals may
lead to irregular histograms, with very few observations (or maybe
none) in some intervals. Thus, selection of the number of class
intervals is important.
Example 1
The frequency distribution of the monthly salaries of 65 employees
working for the P&R company is given in the table.
Answer the following questions using the data given in the table.
a) 3000 TL
b) 2899.99 TL
c) ½(2700+2799.99) = 2749.995 TL (rounded to 2750 TL for
practical purposes)
d) Lower class boundary = ½(2900+2899.99) = 2899.995 TL
Upper class boundary = ½(2999.99+3000) = 2999.995 TL
e) 2999.995 - 2899.995 = 100 TL
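The arithmetic in answers (c) to (e) can be checked with a short sketch. `Decimal` is used so that amounts such as 2899.995 are represented exactly. The helper names `class_mark` and `class_boundary` are illustrative, and the class limits are the ones that appear in the answers above.

```python
from decimal import Decimal


def class_mark(lower_limit, upper_limit):
    """Midpoint of a class: (lower limit + upper limit) / 2."""
    return (lower_limit + upper_limit) / 2


def class_boundary(upper_limit_below, lower_limit_above):
    """Point midway between two adjacent classes."""
    return (upper_limit_below + lower_limit_above) / 2


# (c) class mark of the class 2700 - 2799.99 TL
mark_c = class_mark(Decimal("2700"), Decimal("2799.99"))

# (d) boundaries of the class 2900 - 2999.99 TL
lower_b = class_boundary(Decimal("2899.99"), Decimal("2900"))
upper_b = class_boundary(Decimal("2999.99"), Decimal("3000"))

# (e) class size = upper boundary - lower boundary
width = upper_b - lower_b

print(mark_c, lower_b, upper_b, width)
```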
f) 16
g) 16/65 = 0.246 = 24.6%
h) 2700 – 2799.99 TL
i) Number of employees earning less than 2800 TL = 16+10+8 = 34
Percent of employees earning less than 2800 TL = 34/65 = 52.3%
j) Number of employees earning less than 3000 TL but at least
2600 TL = 10+14+16+10 = 50
Percent of employees earning less than 3000 TL but at least
2600 TL = 50/65 = 76.9%
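Answers (i) and (j) are sums of class frequencies, which can be sketched as below. The frequencies 8, 10, 16, 14 and 10 are the ones that appear in those sums. Their assignment to classes of width 100 TL starting at 2500 TL is inferred from the answers and should be treated as an assumption, since the table itself is not reproduced here. The classes at or above 3000 TL are left out for the same reason.

```python
# Class frequencies keyed by the (assumed) lower class limit in TL.
# Only the classes that appear in answers (i) and (j) are listed.
frequencies = {2500: 8, 2600: 10, 2700: 16, 2800: 14, 2900: 10}
total_employees = 65  # stated in Example 1


def count_between(lo, hi):
    """Employees in classes whose lower limit lies in [lo, hi)."""
    return sum(f for lower, f in frequencies.items() if lo <= lower < hi)


below_2800 = count_between(0, 2800)              # answer (i)
from_2600_to_3000 = count_between(2600, 3000)    # answer (j)

print(below_2800, f"{below_2800 / total_employees:.1%}")
print(from_2600_to_3000, f"{from_2600_to_3000 / total_employees:.1%}")
```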
Example 2
Using the data given in the table in Example 1:
a) Calculate the cumulative frequency distribution
b) Calculate the percent cumulative distribution
c) Draw the cumulative frequency diagram
d) Draw the percent cumulative frequency diagram
48
Example 3
Using the data given in the table in Example 1, calculate the
«more than» cumulative frequency distribution and draw its
graph.