Unit 2
2.1 INTRODUCTION
In the previous unit, we discussed the various ways of collecting data. The
successful use of the data collected depends to a great extent upon the manner
in which it is arranged, displayed and summarised. In this unit, we shall be
mainly interested in the presentation of data. Data can be presented either
in tabular form or through charts. For tabular presentation, the data must be
classified before it is tabulated. Therefore, this unit is divided into two
sections, viz., (a) classification of data and (b) charting of data.
Activity A
What do you understand by classification of data?
Why is classification necessary?
……………………………………………………………………………….
……………………………………………………………………………….
……………………………………………………………………………….
……………………………………………………………………………….
……………………………………………………………………………….
……………………………………………………………………………….
Activity B
With the help of a suitable example, illustrate the difference between
qualitative and quantitative data.
……………………………………………………………………………….
……………………………………………………………………………….
……………………………………………………………………………….
……………………………………………………………………………….
3 2 2 1 3 4 2 1 3 4 5 0 2
1 2 3 3 2 1 1 2 3 0 3 2 1
4 3 5 5 4 3 6 5 4 3 1 0 6
5 4 3 1 2 0 1 2 3 4 5
To condense this data into a discrete frequency distribution, we shall take the
help of 'Tally' marks as shown below:
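The tallying step can also be sketched in Python. The snippet below is only an illustration: it counts how often each value occurs in the 50 observations listed above and prints one tally mark per occurrence. The function and variable names are our own.

```python
from collections import Counter

# The 50 observations listed above, row by row.
observations = [
    3, 2, 2, 1, 3, 4, 2, 1, 3, 4, 5, 0, 2,
    1, 2, 3, 3, 2, 1, 1, 2, 3, 0, 3, 2, 1,
    4, 3, 5, 5, 4, 3, 6, 5, 4, 3, 1, 0, 6,
    5, 4, 3, 1, 2, 0, 1, 2, 3, 4, 5,
]


def discrete_frequency_distribution(values):
    """Return {value: frequency}, ordered by value."""
    counts = Counter(values)
    return {value: counts[value] for value in sorted(counts)}


distribution = discrete_frequency_distribution(observations)

# Print the value, one tally mark per occurrence, and the frequency.
print(f"{'Value':<6}{'Tally':<14}{'Frequency':>9}")
for value, frequency in distribution.items():
    print(f"{value:<6}{'|' * frequency:<14}{frequency:>9}")
print(f"{'Total':<20}{sum(distribution.values()):>9}")
```

The frequencies add up to 50, the number of observations, which is a quick check that no value was missed while tallying.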
The value so obtained is deducted from all lower limits and added to all
upper limits. For instance, the example discussed for the inclusive method can
easily be converted into the exclusive form. Take the difference between 25 and
24.999 and divide it by 2. Thus the correction factor becomes (25 − 24.999)/2 =
0.0005. Deduct this value from the lower limits and add it to the upper limits. The
new frequency distribution will take the following form:
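The adjustment can be sketched in Python as follows. The class limits used here are hypothetical and chosen only for illustration; they assume a gap of 0.001 between the upper limit of one class and the lower limit of the next, which is what a correction factor of 0.0005 implies. The decimal module is used so that the arithmetic is exact.

```python
from decimal import Decimal


def to_exclusive(classes):
    """Convert inclusive (lower, upper) class limits to exclusive ones.

    The correction factor is half the gap between the upper limit of the
    first class and the lower limit of the second class.
    """
    gap = classes[1][0] - classes[0][1]
    correction = gap / 2
    adjusted = [(lower - correction, upper + correction)
                for lower, upper in classes]
    return correction, adjusted


# Hypothetical inclusive class limits (an assumption for illustration).
inclusive = [
    (Decimal("20"), Decimal("24.999")),
    (Decimal("25"), Decimal("29.999")),
    (Decimal("30"), Decimal("34.999")),
]

correction, exclusive = to_exclusive(inclusive)
print("Correction factor:", correction)
for lower, upper in exclusive:
    print(f"{lower} - {upper}")
```

After the adjustment, the upper limit of each class coincides with the lower limit of the next, so the classes are continuous.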
2.7 GUIDELINES FOR CHOOSING THE CLASSES
The following guidelines are useful in choosing the class intervals.
1) The number of classes should not be too small or too large. Preferably,
the number of classes should be between 5 and 15. However, there is no
hard and fast rule about it. If the number of observations is small, the
number of classes should be towards the lower end of this range,
and as the number of observations increases, the number of classes
should be towards the upper end.
2) If possible, the widths of the intervals should be numerically simple like
5, 10, 25 etc. Values like 3, 7, 19 etc. should be avoided.
3) It is desirable to have classes of equal width. However, for
distributions with a wide gap between the minimum and maximum
values, such as income distributions, classes of unequal width can be
formed.
4) The lower limit of the first class should be 0, 5, 10 or a multiple
thereof. For example, if the minimum value is 3 and we are taking a class
interval of 10, the first class should be 0-10 and not 3-13.
5) The class interval should be determined after taking into consideration the
minimum and maximum values and the number of classes to be formed.
For example, if the income of 20 employees in a company varies between
Rs. 1100 and Rs. 5900 and we want to form 5 classes, the class interval
should be 1000, since
(5900 − 1100) / 1000 = 4.8 ≈ 5
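The arithmetic in guideline 5, together with the starting point suggested in guideline 4, can be sketched in Python. This is only an illustration under our own assumptions: the number of classes is rounded up so that the whole range is covered, and the first class starts at the multiple of the class interval at or below the minimum. The function names are hypothetical.

```python
import math


def number_of_classes(minimum, maximum, class_interval):
    """Classes needed to cover the range, rounded up to a whole number."""
    return math.ceil((maximum - minimum) / class_interval)


def class_limits(minimum, class_interval, classes):
    """Class limits starting at the multiple of the interval at or below the minimum."""
    start = (minimum // class_interval) * class_interval
    return [(start + i * class_interval, start + (i + 1) * class_interval)
            for i in range(classes)]


# Incomes between Rs. 1100 and Rs. 5900 with a class interval of 1000.
k = number_of_classes(1100, 5900, 1000)  # (5900 - 1100) / 1000 = 4.8, rounded up
limits = class_limits(1100, 1000, k)
print(k)
print(limits)
```

Because the first class starts below the minimum, it is worth checking that the last class still covers the maximum; if it does not, one more class is needed.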
All the above points can be explained with the help of the following example
wherein the ages of 50 employees are given:
22 21 37 33 28 42 56 33 32 59
40 47 29 65 45 48 55 43 42 40
37 39 56 54 38 49 60 37 28 27
32 33 47 36 35 42 43 55 53 48
29 30 32 37 43 54 55 47 38 62
In order to form the frequency distribution of this data, we take the difference
between the maximum (65) and the minimum (21) and divide it by a class interval
of 10, which gives 4.4, i.e., about 5 classes, as follows:
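The classification of the 50 ages can be sketched in Python. The sketch assumes five classes of width 10 starting at 20 (20-30, 30-40, ..., 60-70), following the guidelines above, and uses the exclusive method, so a value equal to an upper limit is counted in the next class. The function name is our own.

```python
# Ages of the 50 employees, row by row as listed above.
ages = [
    22, 21, 37, 33, 28, 42, 56, 33, 32, 59,
    40, 47, 29, 65, 45, 48, 55, 43, 42, 40,
    37, 39, 56, 54, 38, 49, 60, 37, 28, 27,
    32, 33, 47, 36, 35, 42, 43, 55, 53, 48,
    29, 30, 32, 37, 43, 54, 55, 47, 38, 62,
]


def continuous_frequency_distribution(values, start, width, classes):
    """Count the values falling in each class [lower, upper)."""
    frequencies = [0] * classes
    for value in values:
        index = (value - start) // width
        if not 0 <= index < classes:
            raise ValueError(f"{value} lies outside the chosen classes")
        frequencies[index] += 1
    return [((start + i * width, start + (i + 1) * width), frequency)
            for i, frequency in enumerate(frequencies)]


table = continuous_frequency_distribution(ages, start=20, width=10, classes=5)
for (lower, upper), frequency in table:
    print(f"{lower}-{upper}: {frequency}")
print("Total:", sum(frequency for _, frequency in table))
```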
If we keep on adding the successive frequency of each class starting from the
frequency of the very first class, we shall get cumulative frequencies as
shown below:
Monthly salary (Rs.)    No. of employees    Cumulative frequency
1000-1200                       5                     5
1200-1400                      14                    19
1400-1600                      23                    42
1600-1800                      50                    92
1800-2000                      52                   144
2000-2200                      25                   169
2200-2400                      22                   191
2400-2600                       7                   198
2600-2800                       2                   200
Total                         200
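The running totals in the table can be reproduced with a short Python sketch. The "less than" cumulative frequencies are obtained by adding the class frequencies successively from the first class. The "more than" cumulative frequencies, included here as an illustration for the more than ogive, count the employees at or above each lower class limit. The variable names are our own.

```python
from itertools import accumulate

salary_classes = [
    "1000-1200", "1200-1400", "1400-1600", "1600-1800", "1800-2000",
    "2000-2200", "2200-2400", "2400-2600", "2600-2800",
]
frequencies = [5, 14, 23, 50, 52, 25, 22, 7, 2]

# "Less than" cumulative frequencies: running total from the first class.
less_than = list(accumulate(frequencies))
total = less_than[-1]

# "More than" cumulative frequencies: employees at or above each lower limit.
more_than = [total - below for below in [0] + less_than[:-1]]

for label, f, lt, mt in zip(salary_classes, frequencies, less_than, more_than):
    print(f"{label:<10}{f:>5}{lt:>6}{mt:>6}")
```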
Bar Diagram
Take the years on the X-axis and the population figures on the Y-axis, and
draw a bar to show the population figure for each year. As can be
seen from the diagram, the gap between one bar and the next is kept
equal. The width of the bars is also the same. The only difference is in the
length of the bars, which is why this type of diagram is also known as a
one-dimensional diagram.
Histogram. One of the most commonly used and easily understood methods
for the graphic presentation of a frequency distribution is the histogram. A
histogram is a series of rectangles whose areas are in the same proportion as
the frequencies of a frequency distribution.
To construct a histogram, on the horizontal axis or X-axis, we take the class
limits of the variable and on the vertical axis or Y-axis, we take the
frequencies of the class intervals shown on the horizontal axis. If the class
intervals are of equal width, then the vertical bars in the histogram are also of
equal width. On the other hand, if the class intervals are unequal, then the
frequencies have to be adjusted according to the width of the class interval.
To illustrate a histogram when class intervals are equal, let us consider the
following example.
Daily sales (Rs. thousand)    No. of companies
10-20                               15
20-30                               22
30-40                               35
40-50                               30
50-60                               25
60-70                               20
70-80                               16
80-90                                7
In this example, we may observe that class intervals are of equal width. Let
us take class intervals on the X-axis and their corresponding frequencies on
the Y-axis. On each class interval (as base), erect a rectangle with height
equal to the frequency of that class. In this manner we get a series of
rectangles each having a class interval as its width and the frequency as its
height as shown below:
Histogram with Equal Class Intervals
It should be noted that the area of the histogram represents the total
frequency as distributed throughout the different classes.
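As a rough check of this statement, the following Python sketch treats each class of the daily sales example as a rectangle (class width as base, frequency as height) and compares the total area with the total frequency. It also prints a simple text version of the histogram. The names used are our own.

```python
# (lower limit, upper limit, number of companies), daily sales in Rs. thousand.
sales_classes = [
    (10, 20, 15), (20, 30, 22), (30, 40, 35), (40, 50, 30),
    (50, 60, 25), (60, 70, 20), (70, 80, 16), (80, 90, 7),
]


def rectangle_areas(classes):
    """Area of each rectangle: class width (base) times frequency (height)."""
    return [(upper - lower) * frequency for lower, upper, frequency in classes]


areas = rectangle_areas(sales_classes)
total_frequency = sum(frequency for _, _, frequency in sales_classes)

# A simple text histogram: one '#' per company.
for lower, upper, frequency in sales_classes:
    print(f"{lower:>3}-{upper:<3} {'#' * frequency} {frequency}")

print("Total area:", sum(areas))
print("Total frequency:", total_frequency)
```

With equal class widths of 10, the total area is 10 times the total frequency, so the areas of the rectangles are proportional to the frequencies.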
When the widths of the class intervals are not equal, the frequencies must
be adjusted before constructing the histogram.
The following example will illustrate the procedure:
Income (Rs.)    No. of employees
1000-1500               5
1500-2000              12
2000-2500              15
2500-3500              18
3500-5000              12
5000-7000               8
7000-8000               2
As can be seen, the class intervals in the above example are of unequal width,
and hence we have to find the adjusted frequency of each class, taking
the class with the smallest class interval as the basis of adjustment. For
example, in the class 2500-3500, the class interval is 1000, which is twice the
size of the smallest class interval, i.e., 500, and therefore the frequency of
this class is divided by two, i.e., 18/2 = 9. The other frequencies are obtained
in a similar manner. The adjusted frequencies for the various classes are given
below:
Income (Rs.)    No. of employees
1000-1500               5
1500-2000              12
2000-2500              15
2500-3000               9
3000-3500               9
3500-4000               4
4000-4500               4
4500-5000               4
5000-5500               2
5500-6000               2
6000-6500               2
6500-7000               2
7000-7500               1
7500-8000               1
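The adjustment can be sketched in Python. The snippet below is a minimal illustration using the income distribution from this example: each frequency is divided by the ratio of its class interval to the smallest class interval (500 here). The function name is our own.

```python
# (lower limit, upper limit, number of employees) for the original classes.
income_classes = [
    (1000, 1500, 5), (1500, 2000, 12), (2000, 2500, 15), (2500, 3500, 18),
    (3500, 5000, 12), (5000, 7000, 8), (7000, 8000, 2),
]


def adjusted_frequencies(classes):
    """Frequency per smallest class interval for each class."""
    base = min(upper - lower for lower, upper, _ in classes)
    return [frequency * base / (upper - lower)
            for lower, upper, frequency in classes]


adjusted = adjusted_frequencies(income_classes)
for (lower, upper, frequency), height in zip(income_classes, adjusted):
    print(f"{lower}-{upper}: frequency {frequency}, adjusted {height}")
```

Each adjusted value is the height of the rectangle drawn over every 500-wide part of the class, which matches the table above (for example, 9 for both 2500-3000 and 3000-3500).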
The histogram of the above distribution is shown below:
[Figure: Histogram with Unequal Class Intervals. Vertical axis: Number of Employees.]
[Figure: Number of Companies (vertical axis) against Daily Sales (In Rupees) (horizontal axis).]
[Figure: Frequency Curve. Vertical axis: Number of Companies; horizontal axis: Daily Sales (In Rupees).]
The less than ogive is a rising curve, whereas the more than ogive is a
falling curve.
The concept of ogive is useful in answering questions such as: How many
companies have sales of less than Rs. 52,000 per day, or more than Rs.
24,000 per day, or between Rs. 24,000 and Rs. 52,000?
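Such questions can also be answered approximately by calculation. The Python sketch below uses the daily sales distribution from the histogram example (sales in Rs. thousand) and assumes that the companies in a class are spread evenly over that class, which corresponds to reading values off an ogive drawn with straight line segments. The function names are our own, and the results are estimates, not exact counts.

```python
# (lower limit, upper limit, number of companies), daily sales in Rs. thousand.
sales_classes = [
    (10, 20, 15), (20, 30, 22), (30, 40, 35), (40, 50, 30),
    (50, 60, 25), (60, 70, 20), (70, 80, 16), (80, 90, 7),
]


def less_than(x, classes):
    """Estimated number of companies with sales below x (linear interpolation)."""
    total = 0.0
    for lower, upper, frequency in classes:
        if x >= upper:
            total += frequency  # the whole class lies below x
        elif x > lower:
            total += frequency * (x - lower) / (upper - lower)  # part of the class
    return total


def more_than(x, classes):
    """Estimated number of companies with sales of x or more."""
    total = sum(frequency for _, _, frequency in classes)
    return total - less_than(x, classes)


below_52 = less_than(52, sales_classes)
above_24 = more_than(24, sales_classes)
between_24_and_52 = less_than(52, sales_classes) - less_than(24, sales_classes)
print(round(below_52, 1), round(above_24, 1), round(between_24_and_52, 1))
```

Under this assumption, roughly 107 of the 170 companies have daily sales below Rs. 52,000, about 146 have sales above Rs. 24,000, and about 83 fall between the two figures.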
Activity G
With the help of an example, explain the concept of less than ogive and more
than ogive.
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
2.10 SUMMARY
Data are presented through tables and charts. A frequency
distribution is the principal tabular summary of either discrete or continuous
data. The frequency distribution may show actual, relative or cumulative
frequencies. Actual and relative frequencies may be charted as either a
histogram (a bar chart) or a frequency polygon. The two graphs of cumulative
frequencies are the less than ogive and the more than ogive.
Form a continuous frequency distribution after selecting a suitable class
interval.
8) Draw a histogram and a frequency polygon from the following data:
Marks     No. of students
0-20             8
20-40           12
40-60           15
60-80           12
80-100           3
9) Go through the following data carefully and then construct a histogram.
Income (Rs.)    No. of persons
500-1000              18
1000-1500             20
1500-2500             30
2500-3000             25
3000-4500
4500-5000             12
5000-7000              5
10) The following data relate to the sales of 100 companies:
Sales (Rs. lakhs)    No. of companies
5-10                        5
10-15                      12
15-20                      13
20-25                      20
25-30                      18
30-35                      15
35-40                      10
40-45                       7
Draw the less than and more than ogives. Determine the number of companies
whose sales are (i) less than Rs. 13 lakhs, (ii) more than Rs. 36 lakhs and (iii)
between Rs. 13 lakhs and Rs. 36 lakhs.