Chapter 2 Classification and Presentation of Data
Classification
• Classification is the grouping of data on the basis of any common characteristics they may have. Data may be classified on the following bases:
i. Qualitative basis
ii. Quantitative basis
iii. Time basis
iv. Geographical basis
Data Presentation
Systematically collected data can be presented in the following ways:
• Frequency distribution
• Cumulative frequency distribution
• Relative frequency distribution
• Charts
• Diagrams
Frequency Distribution
Frequency Distribution: A frequency distribution is a statistical table which shows the set of all distinct values of the variable arranged in order of magnitude, either individually or in groups, with the corresponding frequencies side by side.
Depending on the nature of the collected data, it can be presented in one of the following forms:
i. Individual frequency distribution
ii. Discrete (ungrouped) frequency distribution
iii. Continuous (grouped) frequency distribution
Individual Frequency Distribution
• The data collected by the investigator for a statistical enquiry are called raw data. If they are of individual character, they can be presented in an individual frequency distribution table.
• Ex: The temperatures on the different days of a certain week, taken as an individual series, can be presented as follows:
Days                Sun   Mon   Tue   Wed   Thu   Fri   Sat
Temperature (°C)    32    33    31    34    30    29    35
Discrete (ungrouped) frequency distribution
• A discrete frequency distribution is an arrangement of data in a table showing the frequency with which each successive value of a variable occurs.
• Ex: Prepare a frequency distribution for the following marks obtained by 25 students: 15, 16, 16, 17, 18, 18, 17, 15, 15, 16, 16, 17, 15, 16, 16, 15, 16, 16, 15, 17, 17, 18, 19, 16, 15.
The discrete frequency table constructed from the given data:
Marks    Frequency
15       7
16       9
17       5
18       3
19       1
Total    25
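The counting behind a discrete frequency table can be sketched in a few lines of Python. This is only an illustration using the 25 marks from the example; the variable names `marks` and `frequency` are our own.

```python
from collections import Counter

# Marks of the 25 students from the example above.
marks = [15, 16, 16, 17, 18, 18, 17, 15, 15, 16, 16, 17, 15,
         16, 16, 15, 16, 16, 15, 17, 17, 18, 19, 16, 15]

# Counter maps each distinct value to the number of times it occurs.
frequency = Counter(marks)

# Print the table with the values arranged in order of magnitude.
print("Marks  Frequency")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")
print(f"Total  {sum(frequency.values()):>9}")
```

The total of the frequency column should always equal the number of observations, which is a quick check on the tally.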
Continuous (grouped) frequency distribution
• When the number of items is large and the difference between the highest and lowest values is also large, a table of individual values becomes lengthy and unwieldy. In such cases, the data are condensed into groups or classes and presented in the form of a grouped frequency distribution.
• Ex: Suppose that the marks obtained by 40 students are as follows: 32, 48, 37, 48, 32, 39, 41, 50, 46, 42, 56, 43, 49, 42, 47, 50, 46, 38, 42, 48, 37, 46, 43, 31, 45, 40, 30, 36, 40, 55, 39, 43, 34, 47, 52, 56, 48, 44, 38, 48.
• Here the highest mark obtained is 56 and the lowest is 30. We divide the marks into classes such as 30 – 35, 35 – 40, etc., where each class includes its lower limit but not its upper limit. The frequency distribution table is as follows.
Marks      Frequency
30 – 35    5
35 – 40    7
40 – 45    10
45 – 50    12
50 – 55    3
55 – 60    3
Total      40
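The grouping of the 40 marks into classes can be sketched as follows. This is a minimal illustration, assuming classes of width 5 starting at 30, with each class containing its lower limit but not its upper limit; the helper `grouped_frequency` is a hypothetical name of our own.

```python
# Marks of the 40 students from the example above.
marks = [32, 48, 37, 48, 32, 39, 41, 50, 46, 42, 56, 43, 49, 42, 47, 50,
         46, 38, 42, 48, 37, 46, 43, 31, 45, 40, 30, 36, 40, 55, 39, 43,
         34, 47, 52, 56, 48, 44, 38, 48]

def grouped_frequency(values, start, width, n_classes):
    """Count values in the classes [start, start + width), [start + width, ...)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)  # position of the class containing v
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts

counts = grouped_frequency(marks, start=30, width=5, n_classes=6)
for i, f in enumerate(counts):
    lower = 30 + 5 * i
    print(f"{lower} - {lower + 5}: {f}")
print("Total:", sum(counts))
```

A mark of 40 falls in the class 40 – 45 rather than 35 – 40, which is what makes the counts agree with the table.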
• Grouped frequency distribution tables are of two types: (i) exclusive type and (ii) inclusive type.
• Exclusive (continuous) method: When the data are classified by this method, the upper limit of a class interval is the same as the lower limit of the next class, as in the table above (30 – 35, 35 – 40, ...). An observation equal to the upper limit of a class is counted in the next class.
• Inclusive (discontinuous) method: When the data are classified by this method, the upper limit of a class interval and the lower limit of the next class are not the same, i.e. there is a gap between the upper limit of one class and the lower limit of the next, as shown in the table. Both limits are included in the class.
Marks      Frequency
10 – 19    5
20 – 29    7
30 – 39    10
40 – 49    12
50 – 59    3
60 – 69    3
Total      40
Construction of frequency distribution
The following steps are used for the construction of a frequency table:
Step 1: Decide the number of classes. The number of classes should be neither too large nor too small (generally 5 to 15). An appropriate number of classes may be found by Yule's formula:
No. of classes (k) = 2.5 × n^(1/4)
where n is the total number of observations.
Alternatively, the formula suggested by Sturges can be applied:
No. of classes (k) = 1 + 3.322 log10(N)
where N is the total number of observations.
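The two formulas can be compared numerically with a short sketch. The function names `yule_classes` and `sturges_classes` are our own, and the sample sizes 30 and 40 are chosen only because they match the sizes of the data sets used in this chapter. In practice the result is rounded to a convenient whole number.

```python
import math

def yule_classes(n):
    """Yule's formula: k = 2.5 * n^(1/4)."""
    return 2.5 * n ** 0.25

def sturges_classes(n):
    """Sturges' rule: k = 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

for n in (30, 40):
    print(f"n = {n}: Yule k = {yule_classes(n):.2f}, "
          f"Sturges k = {sturges_classes(n):.2f}")
```

For these sample sizes both formulas suggest about 6 classes, which lies within the recommended range of 5 to 15.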
Inclusive method:
Class interval    Frequency
8 – 14            4
15 – 21           4
22 – 28           8
29 – 35           2
36 – 42           5
43 – 49           7
Total             30
Exclusive method:
Class interval    Frequency
7.5 – 14.5        4
14.5 – 21.5       4
21.5 – 28.5       8
28.5 – 35.5       2
35.5 – 42.5       5
42.5 – 49.5       7
Total             30
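The exclusive limits in the second table can be obtained from the inclusive limits by a correction factor equal to half the gap between consecutive classes. The following sketch shows that step; the variable names are our own, and the class limits and frequencies are taken from the two tables above.

```python
# Inclusive class limits and frequencies from the table above.
inclusive = [(8, 14), (15, 21), (22, 28), (29, 35), (36, 42), (43, 49)]
frequencies = [4, 4, 8, 2, 5, 7]

# Gap between the upper limit of one class and the lower limit of the next.
gap = inclusive[1][0] - inclusive[0][1]
correction = gap / 2  # half the gap is the correction factor

# Subtract the correction from each lower limit and add it to each upper limit.
exclusive = [(low - correction, high + correction) for low, high in inclusive]

for (low, high), f in zip(exclusive, frequencies):
    print(f"{low} - {high}: {f}")
```

After the correction, the upper limit of each class coincides with the lower limit of the next, while the frequencies are unchanged.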
Cumulative frequency distribution
• In a cumulative frequency distribution, the cumulative frequencies (c.f.) are obtained by cumulating (successively adding) the frequencies of the successive class intervals. The cumulative frequency of a given class interval thus represents the total of all the previous class frequencies, including that of the class against which it is written.
• A frequency distribution showing the cumulative frequencies against the values of the variable, systematically arranged in increasing or decreasing order, is known as a cumulative frequency distribution.
• We may construct a cumulative frequency distribution table in two ways: (i) the less than method and (ii) the more than method. To illustrate the methods, let us take the following ordinary frequency table:
Marks obtained    Number of students
0 – 10            4
10 – 20           21
20 – 30           28
30 – 40           10
40 – 50           7
Total             70
• Less than Method: When we cumulate the frequencies from the top, we take the upper limits of the class intervals and write the words ‘below’ or ‘less than’ before them. By the less than method, the above example can be represented as follows.
• The number of students who get less than 20 marks = 4 + 21 = 25. Similarly, the number of students who get less than 30 marks = 25 + 28 = 53. Proceeding in this way, we can prepare the cumulative frequency table shown below:
Marks           Calculation      Cumulative frequency
Less than 10    4                4
Less than 20    4 + 21 = 25      25
Less than 30    25 + 28 = 53     53
Less than 40    53 + 10 = 63     63
Less than 50    63 + 7 = 70      70
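• More than Method: When we cumulate the frequencies from the bottom, we take the lower limits of the class intervals and write the words ‘more than’ before them.
Marks           Calculation      Cumulative frequency
More than 0     66 + 4 = 70      70
More than 10    45 + 21 = 66     66
More than 20    17 + 28 = 45     45
More than 30    7 + 10 = 17      17
More than 40    7                7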
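Both cumulations are running totals, one taken from the top of the table and one from the bottom. A minimal sketch using the marks table of 70 students is given below; the variable names are our own.

```python
from itertools import accumulate

# Class intervals and frequencies from the marks table above.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [4, 21, 28, 10, 7]

# Less than method: running totals from the top, labelled by upper limits.
less_than = list(accumulate(frequencies))

# More than method: running totals from the bottom, labelled by lower limits.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for (low, high), lt, mt in zip(classes, less_than, more_than):
    print(f"Less than {high}: {lt:>2}    More than {low}: {mt:>2}")
```

The last less than value and the first more than value both equal the total number of observations (70 here).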
2.6.2 Relative Frequency Distribution
• The frequency table is a summary of the original data. If, instead of simply the number of cases in each group, one would like to know the proportion or percentage of cases in each group, a relative frequency distribution table should be drawn up, as shown below:
Marks obtained    Number of students    Cumulative frequency    Relative frequency
0 – 10            4                     4                       4/70 = 0.06
10 – 20           21                    4 + 21 = 25             21/70 = 0.30
20 – 30           28                    28 + 25 = 53            28/70 = 0.40
30 – 40           10                    10 + 53 = 63            10/70 = 0.14
40 – 50           7                     7 + 63 = 70             7/70 = 0.10
Total             70                                            1.00
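The relative frequency of a class is its frequency divided by the total frequency. The calculation for the table above can be sketched as follows; the variable names are our own.

```python
# Frequencies of the classes 0-10, 10-20, 20-30, 30-40, 40-50.
frequencies = [4, 21, 28, 10, 7]
total = sum(frequencies)

# Relative frequency = class frequency / total frequency.
relative = [f / total for f in frequencies]
rounded = [round(r, 2) for r in relative]

for f, r in zip(frequencies, rounded):
    print(f"{f}/{total} = {r:.2f}  ({r * 100:.0f}%)")
```

The unrounded relative frequencies always add up to 1, so a total that differs noticeably from 1 points to an arithmetic slip.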
Charts and Diagrams
• Diagrams present data in a simple and interesting way, and they are easy to understand.
• Diagrams look attractive and arouse the interest of readers.
• Diagrams have visual appeal and hence are quite impressive. They are not only easy to remember, but the impression they leave on the mind lasts much longer than that left by figures in a table.
• Diagrams help us make quick comparisons of data relating to different times and places.
• Diagrams help us study the relation between two or more sets of data easily and quickly.
• Diagrams save a lot of trouble and time. Figures in a table are not easily understood: one must make an effort to grasp their meaning and draw proper conclusions from them. Diagrams give a clear picture of the data at a single glance, so no time or effort is lost.
Types of Diagrams
• There are various types of diagrams. We will study the following, which are the most commonly used in statistics:
i. Simple bar diagram
ii. Multiple bar diagram
iii. Sub-divided bar diagram
iv. Percentage bar diagram
v. Pie chart (circular diagram)
i. Simple Bar Diagram: It is one of the simplest and most popular diagrams in statistics. It consists of bars or rectangles of equal width. The lengths of the bars represent the different values of the variable. The bars may be vertical or horizontal, but we generally use vertical bars, as they are more attractive and easier to compare.
Ex: Represent by a simple bar diagram the following production of sugar of a particular factory in nine different years.
Years                       2003   2004   2005   2006   2007   2008   2009   2010   2011
Production (in 100 tons)    0.8    1.4    3.2    4.6    5.8    4.8    3.7    7      6.5
[Figure: simple bar diagram of the above data. x-axis: Years (2003 to 2011); y-axis: Production; one bar per year, labelled with the values in the table.]
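The idea that each bar has the same width and a length proportional to its value can be illustrated with a rough text rendering. This sketch draws horizontal bars with `#` characters; the scale of 10 characters per unit of production and the helper name `bar` are arbitrary choices of our own, not part of the example.

```python
# Production of sugar (in 100 tons) from the example above.
years = [2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011]
production = [0.8, 1.4, 3.2, 4.6, 5.8, 4.8, 3.7, 7, 6.5]

SCALE = 10  # characters per unit of production (an arbitrary choice)

def bar(value, scale=SCALE):
    """Return a bar whose length is proportional to the value."""
    return "#" * round(value * scale)

# One bar per year, all of equal width (one text line each).
for year, value in zip(years, production):
    print(f"{year} | {bar(value)} {value}")
```

Because every bar uses the same scale, the longest bar (2010) and the shortest (2003) can be compared at a glance.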
• Multiple Bar Diagram: We use multiple bar diagrams to compare two or more sets of related data. The bars of each set are placed together, and different colors or shades are used to distinguish bars of one type from another.
• Ex: Construct a multiple bar diagram to present the data given below:
Year    Distinction    1st div    2nd div    3rd div
2008    20             80         100        20
Multiple bar diagram of the above information:
[Figure: multiple bar diagram. x-axis: Years (2008 to 2011); y-axis: 0 to 250; bars for each year: dist., 1st div, 2nd div, 3rd div.]
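• Sub-divided Bar Diagram: We use it when the total magnitude of a variable has to be divided into its different parts. Each bar represents a total, and its segments represent the components.
Year    Distinction    1st div    2nd div    3rd div    Failed
2008    20             50         60         30         25
2009    50             65         70         50         45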
• Sub-divided bar diagram of the above information:
[Figure: sub-divided bar diagram. x-axis: Years (2008 to 2011); y-axis: 0 to 700; segments of each bar: dist., 1st div, 2nd div, 3rd div, failed.]
• Percentage Bar Diagram: Here the sum total of the values is taken as 100, and the value of each component is expressed as a percentage of the whole. The height of each bar therefore represents 100, and hence all the bars are of equal length.
• Ex: The expenditure of a family on different items is given below. Represent it by a percentage sub-divided bar diagram.
Items                  Food    Clothing    Education    Fuel    Misc.    Total
Expenditure (in Rs)    8000    5000        4000         2000    1000     20000
The percentage bar diagram of the above example is as follows:
[Figure: percentage bar diagram. y-axis: Percentage (0 to 100); segments from bottom to top: Food 40%, Clothing 25%, Education 20%, Fuel 10%, Misc. 5%.]
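The percentages in the diagram, and the heights at which each segment of the bar begins and ends, can be computed with a short sketch. The family expenditure figures are those of the example; the variable names and the bottom-to-top stacking order are our own assumptions.

```python
# Family expenditure (in Rs) from the example above.
items = ["Food", "Clothing", "Education", "Fuel", "Misc."]
expenditure = [8000, 5000, 4000, 2000, 1000]
total = sum(expenditure)

# Express each component as a percentage of the whole.
percentages = [100 * x / total for x in expenditure]

# Stack the segments: each one starts where the previous one ends.
segments = []
start = 0
for item, p in zip(items, percentages):
    segments.append((item, start, start + p))
    start += p

for item, low, high in segments:
    print(f"{item}: {high - low:.0f}% (from {low:.0f} to {high:.0f})")
```

The last segment ends at 100, which confirms that the whole bar represents the total expenditure.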
Pie Chart:
• Just as with rectangles, we can take a circle and divide it into several parts to represent the total magnitude and the various parts into which it is broken up. We draw a circle to represent the whole and divide it into sectors to represent each of its components. Such a diagram is called a pie diagram, because the sectors look like slices of a pie. It is also known as an angular diagram.
• To draw a pie diagram, we draw a circle of suitable radius and suppose that the total magnitude is represented by its area. Since the angle at the centre of the circle is 360°, we express the component values in terms of the angles which the corresponding sectors make at the centre. The angles can be calculated by using the relation:
Angle of a component = (value of the component / total value) × 360°
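The angle of each sector is the component's share of the total multiplied by 360°. The sketch below applies this to the family expenditure figures from the percentage bar example, which are reused here purely as illustrative input; the variable names are our own.

```python
# Illustrative input: family expenditure (in Rs) from the percentage bar example.
items = ["Food", "Clothing", "Education", "Fuel", "Misc."]
expenditure = [8000, 5000, 4000, 2000, 1000]
total = sum(expenditure)

# Angle of a component = (value of the component / total value) * 360 degrees.
angles = [360 * x / total for x in expenditure]

for item, angle in zip(items, angles):
    print(f"{item}: {angle:.0f} degrees")
print("Sum of angles:", sum(angles))
```

The sector angles must add up to 360°, which serves as a check before the circle is drawn.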
• We take the total of 900 as corresponding to 360° and express the expenditure on each item in degrees, as shown below.
Items    Expenditure    Degree
Three-dimensional pie chart of the above example:
[Figure: three-dimensional pie chart. Sectors: cement, timber, bricks, labor, steel, misc.; sector labels shown: 72, 54, 54, 36, 90, 54.]
Histogram
A histogram constructed from a frequency distribution of grouped data consists of a series of rectangles with no gaps between them. The bases of the rectangles lie on the x-axis, and their areas represent the frequencies of the corresponding classes. When all classes have the same width, the heights of the rectangles are proportional to the frequencies.
Example: Prepare a histogram from the data given below:
Class        0 – 6    6 – 12    12 – 18    18 – 24    24 – 30    30 – 36
Frequency    4        8         15         20         12         6
The histogram of the above frequency distribution is as follows:
[Figure: histogram. x-axis: classes 0-6, 6-12, 12-18, 18-24, 24-30, 30-36; y-axis: frequency (0 to 25); bar heights 4, 8, 15, 20, 12, 6.]
Frequency Polygon
• A frequency polygon is obtained by joining the midpoints of the tops of the histogram rectangles with straight lines.
• The histogram and frequency polygon of the above frequency distribution are as follows:
[Figure: histogram with frequency polygon. x-axis: classes 0-6 to 30-36; y-axis: frequency; values 4, 8, 15, 20, 12, 6.]
[Figure: frequency polygon alone, for the same classes and frequencies.]
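The points joined by a frequency polygon are the class midpoints paired with the class frequencies. The sketch below computes them for the histogram example; closing the polygon on the x-axis one class width beyond each end is a common convention that we assume here, and the variable names are our own.

```python
# Classes and frequencies from the histogram example above.
classes = [(0, 6), (6, 12), (12, 18), (18, 24), (24, 30), (30, 36)]
frequencies = [4, 8, 15, 20, 12, 6]

# Midpoint of each class: (lower limit + upper limit) / 2.
midpoints = [(low + high) / 2 for low, high in classes]
width = classes[0][1] - classes[0][0]

# Assumed convention: close the polygon on the x-axis one class width
# before the first midpoint and one class width after the last.
xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
ys = [0] + frequencies + [0]
polygon = list(zip(xs, ys))

for x, y in polygon:
    print(f"({x}, {y})")
```

Joining these points in order with straight lines gives the polygon; the peak at the midpoint 21 corresponds to the class 18 – 24.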
Ogive Curves
• An ogive is the graph of a cumulative frequency distribution.
• Draw the less than ogive from the following data:
Marks              10 – 20    20 – 30    30 – 40    40 – 50    50 – 60    60 – 70
No. of students    4          6          10         20         18         2
[Figure: less than ogive. x-axis: less than 20, 30, 40, 50, 60, 70; y-axis: No. of students (0 to 60).]
More than Ogive Curve
• Again, we first construct the more than cumulative frequencies from the given distribution, as shown in the table:
Marks           No. of students
More than 10    60
More than 20    56
More than 30    50
More than 40    40
More than 50    20
More than 60    2
• The above data can be presented in a more than ogive as follows:
[Figure: more than ogive curve. y-axis: No. of students (0 to 70).]
Less than and More than Ogive Curves
• If the less than and more than ogives are plotted together on the same graph sheet, the x-coordinate of their point of intersection gives the median of the data.
[Figure: less than and more than ogives on the same axes. y-axis: No. of students (0 to 70); the point of intersection is marked as the median.]
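The intersection of the two ogives can also be located numerically. The sketch below treats both ogives as straight-line segments between class boundaries, using the marks of the 60 students from the ogive example, and finds where they cross by linear interpolation; the helper name `intersection` and the interpolation approach are our own assumptions.

```python
from itertools import accumulate

# Class boundaries and frequencies from the ogive example above.
limits = [10, 20, 30, 40, 50, 60, 70]
frequencies = [4, 6, 10, 20, 18, 2]
total = sum(frequencies)

# Less than ogive: cumulative frequency below each boundary.
less_than = [0] + list(accumulate(frequencies))
# More than ogive: number of observations at or above each boundary.
more_than = [total - c for c in less_than]

def intersection(xs, rising, falling):
    """Return (x, y) where two piecewise-linear curves cross, or None."""
    for i in range(len(xs) - 1):
        d0 = rising[i] - falling[i]          # difference at the left end
        d1 = rising[i + 1] - falling[i + 1]  # difference at the right end
        if d0 <= 0 <= d1 and d0 != d1:
            t = -d0 / (d1 - d0)              # fraction of the way along
            x = xs[i] + t * (xs[i + 1] - xs[i])
            y = rising[i] + t * (rising[i + 1] - rising[i])
            return x, y
    return None

median_x, median_y = intersection(limits, less_than, more_than)
print("Median marks:", median_x, "at cumulative frequency", median_y)
```

The curves cross at a cumulative frequency of half the total (30 of 60 students), and the corresponding value on the x-axis is the median.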
Assignment 1
QN1. Represent by a simple bar diagram the following production of sugar of a particular factory in nine different years.
Years               2003   2004   2005   2006   2007   2008   2009   2010   2011
Sales (in lakhs)    5      10     15     21     25     35     40     32     42
QN2. Represent the following data in a multiple bar diagram.
Years    Distinction    First    Second    Third    Failed
2065     30             80       100       70       50
2066     40             100      120       80       40
2067     50             150      150       110      60
2068     60             200      175       120      75
QN 8: Construct a continuous frequency distribution table for the following 30 observations, taking a suitable class interval: 25, 32, 45, 8, 24, 42, 22, 12, 9, 15, 26, 35, 23, 41, 47, 18, 44, 37, 27, 46, 38, 24, 43, 46, 10, 21, 36, 45, 22, 18.
QN 9: Construct a continuous frequency distribution table for the following 40 observations, taking a suitable class interval: 138, 168, 102, 164, 126, 145, 172, 142, 144, 163, 150, 125, 145, 157, 165, 128, 146, 146, 147, 147, 136, 135, 148, 153, 150, 138, 135, 132, 119, 156, 149, 154, 158, 173, 140, 142, 152, 140, 144, 135.
QN 10: In a survey, it was found that 64 families bought milk in the following quantities in a particular month. Using Sturges' rule, convert the data into a frequency distribution by the inclusive method: 19, 6, 10, 14, 13, 22, 15, 16, 24, 24, 23, 36, 21, 27, 22, 16, 20, 25, 11, 32, 17, 9, 18, 21, 34, 26, 21, 21, 22, 7, 10, 22, 11, 31, 12, 17, 7, 5, 37, 17, 39, 20, 18, 33, 30, 16, 19, 25, 28, 23, 13, 23, 14, 28, 24, 26, 8, 12, 23, 18, 20, 29, 15, 9.