Frequency Distribution Lecture 2 3

1. A frequency distribution groups data into classes and records the number of observations in each class. It condenses raw data into a more understandable form. 2. There are two main types of frequency distributions: discrete, for data that can only take on distinct values, and continuous, for data that can vary along a range of values. 3. Key aspects of a continuous frequency distribution include the class limits, which define the lowest and highest values in a class, and the class interval or width, which is the range between lower and upper class limits. The frequency is the number of observations in each class.

Uploaded by

rafid Rafuu
Copyright
© © All Rights Reserved

FREQUENCY DISTRIBUTION

Introduction:
A frequency distribution is a series in which observations with similar or closely related
values are put into separate groups, with the groups arranged in order of magnitude. It is
simply a table in which the data are grouped into classes and the number of cases falling in
each class is recorded. It shows the frequency of occurrence of the different values of a
single phenomenon.

A frequency distribution is constructed for three main reasons:

1. To facilitate the analysis of data.
2. To estimate frequencies of the unknown population distribution from the distribution of
   sample data.
3. To facilitate the computation of various statistical measures.


Raw data:
The statistical data collected are generally raw or ungrouped data. Let us consider
the daily wages (in Rs) of 30 labourers in a factory.

80 70 55 50 60 65 40 30 80 90
75 45 35 65 70 80 82 55 65 80
60 55 38 65 75 85 90 65 45 75

The above figures are raw or ungrouped data: they are recorded as they occur, without any
prior arrangement. In this form the data convey little useful information and are rather
confusing. A better way is to arrange the figures in ascending or descending order of
magnitude, which is commonly known as an array. This does not, however, reduce the bulk of
the data. The above data, when formed into an array, take the following form:

30 35 38 40 45 45 50 55 55 55
60 60 65 65 65 65 65 70 70 75
75 75 80 80 80 80 82 85 90 90
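
Forming the array by hand is error-prone, so the arrangement can be checked mechanically. The following is a minimal sketch in Python (the variable names are illustrative and not part of the lecture) that sorts the 30 raw wages and reads off the minimum and maximum:

```python
# Daily wages (in Rs) of the 30 labourers, copied from the raw data above.
wages = [
    80, 70, 55, 50, 60, 65, 40, 30, 80, 90,
    75, 45, 35, 65, 70, 80, 82, 55, 65, 80,
    60, 55, 38, 65, 75, 85, 90, 65, 45, 75,
]

# An array is the same data arranged in ascending (or descending) order.
array = sorted(wages)

# The smallest and largest values sit at the two ends of the array.
smallest, largest = array[0], array[-1]

# Print the array ten values per row, as in the text.
for start in range(0, len(array), 10):
    print(*array[start:start + 10])
print("minimum:", smallest, "maximum:", largest)
```

Sorting changes only the order of the observations, never their number, which is why the array does not reduce the bulk of the data.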

The array helps us to see at once the maximum and minimum values. It also gives a
rough idea of the distribution of the items over the range. When there is a large number of
items, forming an array is tedious and cumbersome. The data should therefore be condensed
for better understanding, which may be done in two ways, depending on the nature of the
data.
a) Discrete (or) Ungrouped frequency distribution:
In this form of distribution, the frequency refers to a discrete value. The data are
presented so that the exact measurements of the units are clearly indicated.
There is a definite difference between the values of the variable in different groups of
items. Each class is distinct and separate from the other classes, and there is no continuity
from one class to another. Examples of such data are the number of rooms in a house, the
number of companies registered in a country, the number of children in a family, etc.
The process of preparing this type of distribution is very simple. We just count
the number of times a particular value is repeated, which is called the frequency of that class.
To facilitate counting, prepare a column of tallies.
In one column, place all possible values of the variable from the lowest to the highest.
Then, in the tally column, put a bar (vertical line) opposite the value to which each
observation relates.
The bars are arranged in blocks of five, with some space left between blocks. Finally, we
count the number of bars to get the frequency.
Example 1:
In a survey of 40 families in a village, the number of children per family was recorded
and the following data were obtained.

1 0 3 2 1 5 6 2
2 1 0 3 4 2 1 6
3 2 1 5 3 3 2 4
2 2 3 0 2 1 4 5
3 3 4 4 1 2 4 5

Represent the data in the form of a discrete frequency distribution.


Solution:
Frequency distribution of the number of children

Number of children    Tally marks     Frequency
        0             |||                  3
        1             ||||| ||             7
        2             ||||| |||||         10
        3             ||||| |||            8
        4             ||||| |              6
        5             ||||                 4
        6             ||                   2
      Total                               40
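
The tallying procedure can be sketched in a few lines of Python. This is only an illustration: `collections.Counter` does the counting, and the helper `tally` is a hypothetical name introduced here to draw bars in blocks of five as described above.

```python
from collections import Counter

# Number of children in each of the 40 families (Example 1).
children = [
    1, 0, 3, 2, 1, 5, 6, 2,
    2, 1, 0, 3, 4, 2, 1, 6,
    3, 2, 1, 5, 3, 3, 2, 4,
    2, 2, 3, 0, 2, 1, 4, 5,
    3, 3, 4, 4, 1, 2, 4, 5,
]

# Count how many times each distinct value occurs.
frequency = Counter(children)


def tally(count):
    """Render a count as bars in blocks of five, separated by a space."""
    blocks = ["|" * 5] * (count // 5)
    if count % 5:
        blocks.append("|" * (count % 5))
    return " ".join(blocks)


# List every possible value from the lowest to the highest.
for value in range(min(children), max(children) + 1):
    print(f"{value:>8}  {tally(frequency[value]):<12} {frequency[value]:>3}")
print(f"{'Total':>8}  {'':<12} {sum(frequency.values()):>3}")
```

Looping over the full range of values, rather than over the keys of the counter, keeps a value in the table even when its frequency is zero.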
b) Continuous frequency distribution:
In this form of distribution, the frequency refers to groups of values. This becomes
necessary for variables that can take any fractional value, for which an exact measurement
is not possible. A discrete variable can also be presented in the form of a continuous
frequency distribution.
Wage distribution of 100 employees

Weekly wages (Rs)    Number of employees
     50-100                   4
    100-150                  12
    150-200                  22
    200-250                  33
    250-300                  16
    300-350                   8
    350-400                   5
     Total                  100

Nature of class:
The following are some basic technical terms used when a continuous frequency distribution
is formed or data are classified according to class intervals.
a) Class limits:
The class limits are the lowest and the highest values that can be included in the
class. For example, take the class 30-40. The lowest value of the class is 30 and the highest
value is 40. The two boundaries of a class are known as the lower limit and the upper limit
of the class. The lower limit of a class is the value below which there can be no item in the
class. The upper limit of a class is the value above which there can be no item in that class.
For the class 60-79, 60 is the lower limit and 79 is the upper limit, i.e. in this case there
can be no value less than 60 or more than 79. The way in which class limits are stated
depends upon the nature of the data. In statistical calculations, the lower class limit is
denoted by L and the upper class limit by U.
b) Class Interval:
The class interval may be defined as the size of each grouping of data. For example,
50-75, 75-100, 100-125, … are class intervals. Each grouping begins with the lower limit of a
class interval and ends at the lower limit of the next succeeding class interval.
c) Width or size of the class interval:
The difference between the lower and upper class limits is called the width or size of the
class interval and is denoted by ‘C’.
d) Range:
The difference between the largest and smallest values of the observations is called the
range and is denoted by ‘R’, i.e.
R = Largest value - Smallest value
R = L - S
e) Mid-value or mid-point:
The central point of a class interval is called the mid-value or mid-point. It is found
by adding the upper and lower limits of a class and dividing the sum by 2, i.e.

$$\text{Mid-value} = \frac{L + U}{2}$$

For example, if the class interval is 20-30, then the mid-value is

$$\frac{20 + 30}{2} = 25$$

f) Frequency:
Number of observations falling within a particular class interval is called frequency of
that class.

Let us consider the frequency distribution of weights of persons working in a company.

Weight (in kg)    Number of persons
    30-40                25
    40-50                53
    50-60                77
    60-70                95
    70-80                80
    80-90                60
    90-100               30
    Total               420

In the above example, the class frequencies are 25, 53, 77, 95, 80, 60 and 30. The total
frequency is 420. The total frequency indicates the total number of observations considered
in a frequency distribution.
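
The terms defined so far (class limits, width, mid-value, frequency and total frequency) can be read off the weight table with a short Python sketch. The function names are illustrative assumptions, not notation from the lecture:

```python
# Weight classes (lower limit L, upper limit U, in kg) and their
# frequencies, taken from the table above.
classes = [(30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90), (90, 100)]
frequencies = [25, 53, 77, 95, 80, 60, 30]


def width(lower, upper):
    """Width (size) of a class interval: C = U - L."""
    return upper - lower


def mid_value(lower, upper):
    """Mid-value of a class interval: (L + U) / 2."""
    return (lower + upper) / 2


for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower}-{upper}: width={width(lower, upper)}, "
          f"mid-value={mid_value(lower, upper)}, frequency={f}")

# The total frequency is the number of observations in the distribution.
total_frequency = sum(frequencies)
print("Total frequency:", total_frequency)
```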
g) Number of class intervals:
The number of class intervals in a frequency distribution is a matter of importance. The
number of class intervals should not be too large. For an ideal frequency distribution, the
number of class intervals can vary from 5 to 15. To decide the number of class intervals for
the whole data, we pick out the lowest and the highest values. The difference between them
will enable us to decide the class intervals.
The number of class intervals can be fixed arbitrarily, keeping in view the nature of the
problem under study, or it can be decided with the help of Sturges’ Rule. According to
Sturges, the number of classes can be determined by the formula

$$K = 1 + 3.322 \log_{10} N$$

where N = total number of observations,
log = logarithm of the number to base 10,
K = number of class intervals.

Thus, if the number of observations is 10, the number of class intervals is

$$K = 1 + 3.322 \log_{10} 10 = 4.322 \approx 4$$

If 100 observations are being studied, the number of class intervals is

$$K = 1 + 3.322 \log_{10} 100 = 7.644 \approx 8$$

and so on.

Another rule of thumb is that the number of classes should be around √n, where n is the
number of observations in the data.
Actually, there is no hard and fast rule for the number of classes. It depends on the volume
and nature of the data.
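
As a quick check of the two rules of thumb, the sketch below evaluates Sturges' rule and the square-root rule for the sample sizes used above. The function names are assumptions made for this illustration:

```python
import math


def sturges_classes(n):
    """Sturges' rule: K = 1 + 3.322 * log10(N)."""
    return 1 + 3.322 * math.log10(n)


def sqrt_classes(n):
    """Square-root rule of thumb: about sqrt(n) classes."""
    return math.sqrt(n)


for n in (10, 100):
    k = sturges_classes(n)
    print(f"N={n}: Sturges K={k:.3f} (about {round(k)}), "
          f"sqrt rule={sqrt_classes(n):.2f}")
```

The two rules need not agree (for N = 100 the square-root rule suggests 10 classes), which fits the remark that there is no hard and fast rule.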

h) Size of the class interval:


The size of the class interval is inversely proportional to the number of class
intervals in a given distribution. The approximate value of the size (or width or magnitude)
of the class interval ‘C’ is obtained by using Sturges’ rule as

$$C = \frac{\text{Range}}{\text{Number of class intervals}} = \frac{\text{Range}}{1 + 3.322 \log_{10} N}$$

where Range = Largest value - Smallest value in the distribution.
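
To see the formula at work, here is a minimal sketch that applies it to the 30 daily wages from the raw-data section. The function name `class_size` is hypothetical, and the result is only a starting point; in practice it would be rounded to a convenient figure.

```python
import math


def class_size(values):
    """Approximate class-interval size C = Range / (1 + 3.322 * log10(N))."""
    data_range = max(values) - min(values)
    number_of_classes = 1 + 3.322 * math.log10(len(values))
    return data_range / number_of_classes


# Daily wages (in Rs) of the 30 labourers from the raw-data section.
wages = [
    80, 70, 55, 50, 60, 65, 40, 30, 80, 90,
    75, 45, 35, 65, 70, 80, 82, 55, 65, 80,
    60, 55, 38, 65, 75, 85, 90, 65, 45, 75,
]

c = class_size(wages)
print(f"Range = {max(wages) - min(wages)}, approximate class size C = {c:.2f}")
```

For these wages the range is 60 and C comes out slightly above 10, so a class size of 10 would be a natural choice.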


Types of class intervals:
There are three methods of classifying the data according to class intervals, namely

a) Exclusive method
b) Inclusive method
c) Open-end classes

a) Exclusive method:
When the class intervals are so fixed that the upper limit of one class is the lower limit
of the next class; it is known as the exclusive method of classification. The following data are
classified on this basis.

Expenditure (Rs.)    No. of families
     0-5000                 60
  5000-10000                95
 10000-15000               122
 15000-20000                83
 20000-25000                40
    Total                  400

It is clear that the exclusive method ensures continuity of data, inasmuch as the upper
limit of one class is the lower limit of the next class. In the above example, there are 60
families whose expenditure is between Rs.0 and Rs.4999.99. A family whose expenditure is
Rs.5000 would be included in the class interval 5000-10000. This method is widely used in
practice.
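
The rule that a value equal to an upper limit belongs to the next class can be sketched with the standard `bisect` module. This is one possible way to express the rule, not a method given in the lecture, and `exclusive_class` is a hypothetical helper name:

```python
from bisect import bisect_right

# Class limits of the expenditure table: 0-5000, 5000-10000, ..., 20000-25000.
limits = [0, 5000, 10000, 15000, 20000, 25000]


def exclusive_class(value):
    """Return the (lower, upper) class containing value, with lower <= value < upper."""
    if not limits[0] <= value < limits[-1]:
        raise ValueError(f"{value} lies outside the classified range")
    # bisect_right counts the limits that are <= value, so a value equal
    # to a limit is placed in the class that starts at that limit.
    i = bisect_right(limits, value) - 1
    return limits[i], limits[i + 1]


for expenditure in (0, 4999.99, 5000, 24999):
    lower, upper = exclusive_class(expenditure)
    print(f"Rs.{expenditure} falls in the class {lower}-{upper}")
```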
b) Inclusive method:
In this method, the overlapping of the class intervals is avoided. Both the lower and upper
limits are included in the class interval. This type of classification may be used for a
grouped frequency distribution of a discrete variable, such as the number of members in a
family or the number of workers in a factory, where the variable may take only integral
values. It cannot be used with fractional values like age, height, weight etc.
This method may be illustrated as follows:

Class interval    Frequency
     5-9               7
    10-14             12
    15-19             15
    20-29             21
    30-34             10
    35-39              5
    Total             70

Thus, to decide whether to use the inclusive method or the exclusive method, it is
important to determine whether the variable under observation is a continuous or a discrete
one. In the case of continuous variables, the exclusive method must be used. The inclusive
method should be used in the case of discrete variables.
c) Open end classes:

In open-end classes, a class limit is missing at the lower end of the first class interval,
at the upper end of the last class interval, or at both. The necessity of open-end classes
arises in a number of practical situations, particularly relating to economic and medical
data, when there are a few very high or very low values which are far apart from the
majority of observations.

An example of open-end classes is as follows:

Salary range        No. of workers
Below 2000                 7
2000-4000                  5
4000-6000                  6
6000-8000                  4
8000 and above             3
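
One way to handle open-end classes in code is to stand in for the missing limits with negative and positive infinity. The sketch below makes that assumption, treats the closed classes by the exclusive method, and uses hypothetical salaries purely for illustration:

```python
import math

# Open-end classes from the salary table: (label, lower limit, upper limit).
# A missing limit is represented by -inf or +inf (a choice made for this sketch).
classes = [
    ("Below 2000", -math.inf, 2000),
    ("2000-4000", 2000, 4000),
    ("4000-6000", 4000, 6000),
    ("6000-8000", 6000, 8000),
    ("8000 and above", 8000, math.inf),
]


def salary_class(salary):
    """Return the label of the class with lower <= salary < upper."""
    for label, lower, upper in classes:
        if lower <= salary < upper:
            return label
    raise ValueError(f"{salary} could not be classified")


# Hypothetical salaries, not taken from the lecture.
for salary in (1500, 2000, 7999, 25000):
    print(salary, "->", salary_class(salary))
```

A very high value such as 25000 is absorbed by the last class without stretching the table, which is the practical reason for leaving that class open.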

Construction of frequency table:


Constructing a frequency distribution depends on the nature of the given data. Hence,
the following general considerations may be borne in mind for ensuring a meaningful
classification of data.
1. The number of classes should preferably be between 5 and 20. However, there is no
rigidity about it.
2. As far as possible, one should avoid class-interval sizes such as 3, 7, 11, 26, etc.
Preferably, the class-interval size should be 5 or a multiple of 5, such as 10, 20, 25, 100, etc.
3. The starting point, i.e. the lower limit of the first class, should be either zero or 5 or
a multiple of 5.
4. To ensure continuity and to get correct class intervals, we should adopt the “exclusive”
method.
5. Wherever possible, it is desirable to use class intervals of equal size.
Preparation of frequency table:
The presentation of data in the form of a frequency distribution describes the basic
pattern which the data assume in the mass. A frequency distribution gives a better picture of
the pattern of the data if the number of items is large. If the identity of the individuals
about whom particular information is taken is not relevant, then the first step of
condensation is to divide the observed range of the variable into a suitable number of class
intervals and to record the number of observations in each class.

Example 1:
Given below are the numbers of tools produced by workers in a factory.

43 18 25 18 39 44 19 20 20 26
40 45 38 25 13 14 27 41 42 17
34 31 32 27 33 37 25 26 32 25
33 34 35 46 29 34 31 34 35 24
28 30 41 32 29 28 30 31 30 34
31 35 36 29 26 32 36 35 36 37
32 23 22 29 33 37 33 27 24 36
23 42 29 37 29 23 44 41 45 39
21 21 42 22 28 22 15 16 17 28
22 29 35 31 27 40 23 32 40 37

Construct frequency distribution with inclusive type of class interval.


Solution:
Using Sturges’ formula for determining the number of class intervals, we have

$$\text{Number of class intervals} = 1 + 3.322 \log_{10} N = 1 + 3.322 \log_{10} 100 = 7.6$$

$$\text{Size of class interval} = \frac{\text{Range}}{\text{Number of class intervals}} = \frac{46 - 13}{7.6} = 4.3$$

which is rounded up to 5.

Hence, taking the magnitude of the class intervals as 5, we have 7 classes: 13-17, 18-22, …,
43-47 are the classes by the inclusive type. Using tally marks, the required frequency
distribution is obtained in the following table.

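Class interval    Tally marks                        Number of tools produced (Frequency)
    13-17         ||||| |                                 6
    18-22         ||||| ||||| |                          11
    23-27         ||||| ||||| ||||| ||                   17
    28-32         ||||| ||||| ||||| ||||| |||||          25
    33-37         ||||| ||||| ||||| ||||| |||            23
    38-42         ||||| ||||| ||                         12
    43-47         ||||| |                                 6
    Total                                               100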
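
The whole construction (Sturges' rule, class size, inclusive classes, counting) can be reproduced from the listed data with a short Python sketch. The variable names are illustrative, and rounding the class size up with `math.ceil` is an assumption that mirrors the rounding to 5 in the solution:

```python
import math

# Number of tools produced by each of the 100 workers, as listed above.
tools = [
    43, 18, 25, 18, 39, 44, 19, 20, 20, 26,
    40, 45, 38, 25, 13, 14, 27, 41, 42, 17,
    34, 31, 32, 27, 33, 37, 25, 26, 32, 25,
    33, 34, 35, 46, 29, 34, 31, 34, 35, 24,
    28, 30, 41, 32, 29, 28, 30, 31, 30, 34,
    31, 35, 36, 29, 26, 32, 36, 35, 36, 37,
    32, 23, 22, 29, 33, 37, 33, 27, 24, 36,
    23, 42, 29, 37, 29, 23, 44, 41, 45, 39,
    21, 21, 42, 22, 28, 22, 15, 16, 17, 28,
    22, 29, 35, 31, 27, 40, 23, 32, 40, 37,
]

# Sturges' rule for the number of classes, then the class size rounded up.
k = 1 + 3.322 * math.log10(len(tools))
size = math.ceil((max(tools) - min(tools)) / k)

# Inclusive classes start at the smallest value: 13-17, 18-22, ...
lowest = min(tools)
number_of_classes = (max(tools) - lowest) // size + 1

# Each value falls in the class whose index is (value - lowest) // size.
frequency = [0] * number_of_classes
for value in tools:
    frequency[(value - lowest) // size] += 1

for i, count in enumerate(frequency):
    lower = lowest + i * size
    print(f"{lower}-{lower + size - 1}: {count}")
print("Total:", sum(frequency))
```

Because the variable takes only whole-number values, both limits of each class can be included without any value falling between two classes.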

Percentage frequency table:


Comparison becomes difficult, and at times impossible, when the total number of items is
large and differs widely from one distribution to another. Under these circumstances, a
percentage frequency distribution facilitates easy comparison. In a percentage frequency
table, the actual frequencies are converted into percentages. The percentages are calculated
by using the formula given below:

$$\text{Frequency percentage} = \frac{\text{Actual frequency}}{\text{Total frequency}} \times 100$$

It is also called a relative frequency table.

An example of a percentage frequency table is given below.

Marks     No. of students    Frequency percentage
 0-10            3                    6
10-20            8                   16
20-30           12                   24
30-40           17                   34
40-50            6                   12
50-60            4                    8
Total           50                  100
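
The conversion to percentages is a one-line calculation per class. A minimal sketch, using the marks table above (the dictionary layout is an assumption of this example):

```python
# Marks classes and number of students, from the table above.
students = {
    "0-10": 3,
    "10-20": 8,
    "20-30": 12,
    "30-40": 17,
    "40-50": 6,
    "50-60": 4,
}

total = sum(students.values())

# Frequency percentage = actual frequency / total frequency * 100.
percentages = {marks: f * 100 / total for marks, f in students.items()}

for marks, pct in percentages.items():
    print(f"{marks:>6}  {students[marks]:>3}  {pct:>5.1f}")
print(f"{'Total':>6}  {total:>3}  {sum(percentages.values()):>5.1f}")
```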

Cumulative frequency table:


A cumulative frequency distribution keeps a running total of the frequencies. It is
constructed by adding the frequency of the first class interval to the frequency of
the second class interval, then adding that total to the frequency of the third class
interval, and continuing in this way until the final total, appearing opposite the
last class interval, is the total of all frequencies. The cumulation may be downward
or upward. A downward cumulation results in a list presenting the number of
frequencies “less than” any given amount, as revealed by the lower limit of the
succeeding class interval, and an upward cumulation results in a list presenting the
number of frequencies “more than” any given amount, as revealed by the upper
limit of the preceding class interval.
Example 2:
Age group     Number of    Less than cumulative    More than cumulative
(in years)    women        frequency               frequency
  15-20           3                 3                      64
  20-25           7                10                      61
  25-30          15                25                      54
  30-35          21                46                      39
  35-40          12                58                      18
  40-45           6                64                       6
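
Both cumulations are running totals, taken from opposite ends of the table. The sketch below uses `itertools.accumulate` from the Python standard library on the frequencies of Example 2; the variable names are illustrative:

```python
from itertools import accumulate

# Age groups and number of women, from Example 2.
age_groups = ["15-20", "20-25", "25-30", "30-35", "35-40", "40-45"]
frequencies = [3, 7, 15, 21, 12, 6]

# "Less than" cumulation: running total from the first class downwards.
less_than = list(accumulate(frequencies))

# "More than" cumulation: running total from the last class upwards.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for group, f, lt, mt in zip(age_groups, frequencies, less_than, more_than):
    print(f"{group:>6}  {f:>3}  {lt:>3}  {mt:>3}")
```

The last "less than" entry and the first "more than" entry both equal the total frequency, which is a convenient check on the arithmetic.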

Cumulative percentage Frequency table:


Instead of cumulative frequencies, if cumulative percentages are given, the
distribution is called a cumulative percentage frequency distribution. We can form
this table either by converting the frequencies into percentages and then cumulating
them, or by converting the given cumulative frequencies into percentages.
Example 3:

Income (in Rs.)    No. of families    Cumulative frequency    Cumulative percentage
  2000-4000               8                     8              5.7 (= 8/140 × 100)
  4000-6000              15                    23             16.4
  6000-8000              27                    50             35.7
  8000-10000             44                    94             67.1
 10000-12000             31                   125             89.3
 12000-14000             12                   137             97.9
 14000-20000              3                   140            100.0
    Total               140
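
The second route described above (cumulate first, then convert to percentages) can be sketched as follows for Example 3. The names are illustrative, and rounding to one decimal place is chosen to match the table:

```python
from itertools import accumulate

# Income classes (in Rs.) and number of families, from Example 3.
income = ["2000-4000", "4000-6000", "6000-8000", "8000-10000",
          "10000-12000", "12000-14000", "14000-20000"]
families = [8, 15, 27, 44, 31, 12, 3]

total = sum(families)

# Cumulate the frequencies, then express each running total as a percentage.
cumulative = list(accumulate(families))
cumulative_pct = [round(c * 100 / total, 1) for c in cumulative]

for cls, f, c, p in zip(income, families, cumulative, cumulative_pct):
    print(f"{cls:>12}  {f:>3}  {c:>4}  {p:>6.1f}")
```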
