Unit-1 - Data Classification and Tabulation - L-4 - 28th November, 2020

The document discusses data classification and tabulation. It defines classification as arranging data into groups based on common characteristics. The objectives of classification are to simplify data, allow comparisons, and help with analysis. Types of classification include geographical, chronological, qualitative, and quantitative. Tabulation involves arranging data in a table for analysis and inferences. Frequency distribution tables organize data into numerical intervals and classes.

Uploaded by

Manas Kumar

Data Classification

The data collected for the purpose of a statistical inquiry sometimes consist of a
few fairly simple figures, which can be easily understood without any special
treatment. More often, however, there is a mass of raw data without any structure.
Such an unwieldy, unorganized and shapeless mass of collected data cannot be
easily assimilated or interpreted, and it is not fit for further analysis. To make
the data simple and easily understandable, the first task is to condense and
simplify them in such a way that irrelevant data are removed and their significant
features stand out prominently. The procedure adopted for this purpose is known
as the method of classification and tabulation. Classification helps proper
tabulation. Classification is the process of arranging data into sequences and
groups according to their common characteristics, or separating them into
different but related parts.
In other words, the process of grouping a large number of individual facts and
observations on the basis of similarity among the items is called classification.

Objectives / purposes of classifications

i) To simplify and condense the large data

ii) To present the facts in an easily understandable form

iii) To allow comparisons

iv) To help to draw valid inferences

v) To relate the variables among the data

vi) To help further analysis

vii) To eliminate unwanted data

viii) To prepare tabulation

Guiding principles (rules) of classifications

The following are the general guiding principles for a good classification:

a) Exhaustive: Classification should be exhaustive. Each and every item in the data
must belong to one of the classes. Introduction of a residual class (e.g. "others",
"miscellaneous") should be avoided.

b) Mutually exclusive: Each item should be placed in only one class.

c) Suitability: The classification should conform to the object of the inquiry.

d) Stability: Only one principle must be maintained throughout the classification
and analysis.

e) Homogeneity: The items included in each class must be homogeneous.

f) Flexibility: A good classification should be flexible enough to accommodate
new or changed situations.
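
The first two principles can be checked mechanically. The following is a minimal sketch, assuming numeric classes written as (lower, upper) pairs with the lower limit included and the upper limit excluded. The function names, class labels and observations are hypothetical and chosen only for illustration.

```python
def classes_covering(value, classes):
    """Return the labels of all classes whose [lower, upper) range contains value."""
    return [label for label, (lower, upper) in classes.items()
            if lower <= value < upper]


def check_classification(values, classes):
    """Return (unclassified, ambiguous) values for a proposed set of classes.

    unclassified: values that fall in no class (classification is not exhaustive).
    ambiguous: values that fall in more than one class (not mutually exclusive).
    """
    unclassified = [v for v in values if len(classes_covering(v, classes)) == 0]
    ambiguous = [v for v in values if len(classes_covering(v, classes)) > 1]
    return unclassified, ambiguous


# Hypothetical classes and observations, for illustration only.
good_classes = {"0-10": (0, 10), "10-20": (10, 20), "20-30": (20, 30)}
bad_classes = {"0-10": (0, 10), "5-20": (5, 20)}
observations = [3, 7, 12, 25]

print(check_classification(observations, good_classes))  # ([], [])
print(check_classification(observations, bad_classes))   # ([25], [7])
```

With the second set of classes, 25 belongs to no class and 7 belongs to two classes, so that classification is neither exhaustive nor mutually exclusive.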

Important types of classification

a) Geographical (on the basis of area or region wise)

b) Chronological (with respect to time)

c) Qualitative (on the basis of attributes)

d) Numerical, quantitative (on the basis of magnitude)

a) Geographical Classification

In geographical classification, the classification is based on the geographical
regions.

Ex: Sales of the company, region-wise (in million rupees)

Region    Sales
East      235
West      215
North     265
South     247

b) Chronological Classification

If the statistical data are classified according to the time of their occurrence,
the classification is called chronological classification.

Ex: Sales reported by a departmental store (Rs. in lakhs)

Month       Sales
January     45
February    63
March       48
April       54
May         56
June        60
July        64

c) Qualitative Classification

In qualitative classification, the data are classified according to the presence or
absence of attributes in the given units. Thus, the classification is based on some
quality characteristics / attributes.

Further, it may be classified as

i) Simple classification

ii) Manifold classification

i) Simple classification: If the data are divided into only two classes (for
example, by the presence or absence of a single attribute), the classification is
known as simple classification.

ii) Manifold classification: In this classification, the data are classified on the
basis of more than one attribute at a time.

d) Quantitative Classification: In quantitative classification, the classification is
based on quantitative measurements of some characteristic, such as age, marks,
demand, supply, production, sales etc. The quantitative phenomenon under study is
known as a variable, and hence this classification is also called classification by
variable.

For example, when students are classified by the marks they obtain, the marks are
the variable and the number of students in each class is the frequency.

Tabulation
Tabulation may be defined as the systematic arrangement of data in columns and rows.
It is designed to simplify the presentation of data for the purpose of analysis and
statistical inference.

Major Objectives of Tabulation

1. To simplify the complex data

2. To facilitate comparison

3. To draw valid inference / conclusions

4. To help in further analysis

Classification of tables

Tables are classified on the basis of

1. Coverage (Simple and complex table)

2. Objective / purpose (General purpose / Reference table / Special table or
summary table)

3. Nature of inquiry (primary and derived table).

Frequency Distribution

A frequency distribution is a table used to organize data. The left column (called
classes or groups) lists numerical intervals of the variable under study. The right
column lists the frequencies, that is, the number of occurrences in each
class/group. The intervals are normally of equal size and together cover the range
of the sample observations.

It is simply a table in which the gathered data are grouped into classes and the
number of occurrences that fall in each class is recorded.

A frequency distribution can be classified as

a) Series of individual observations

b) Discrete frequency distribution

c) Continuous frequency distribution

a) Series of individual observations

A series of individual observations is a series in which the items are listed one
after another, each observation separately. For statistical calculations, these
observations can be arranged in either ascending or descending order. Such an
arrangement is called an array.
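
As a small illustration, an array can be formed with Python's built-in `sorted`. This is only a sketch. The observations reuse the monthly sales figures given earlier, and the variable names are assumptions made for the example.

```python
# Individual observations (monthly sales figures from the example above).
observations = [45, 63, 48, 54, 56, 60, 64]

# An array is the same series arranged in order of magnitude.
ascending = sorted(observations)
descending = sorted(observations, reverse=True)

print(ascending)   # [45, 48, 54, 56, 60, 63, 64]
print(descending)  # [64, 63, 60, 56, 54, 48, 45]
```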

b) Discrete (ungrouped) Frequency Distribution

Discrete data are generated by counting each and every observation. When an
observation is repeated, each repetition is counted. The number of times an
observation occurs is called the frequency of that observation. The class limits in
discrete data are true class limits; there are no class boundaries in discrete data.

Given below are the marks (out of 25) obtained by 20 students in Mathematics.

21, 23, 19, 17, 12, 15, 15, 17, 17, 19, 23, 23, 21, 23, 25, 25, 21, 19, 19, 19
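
The discrete frequency distribution for these marks can be tallied as follows. This is a minimal sketch using `collections.Counter` from the Python standard library. The variable names and the printed layout are assumptions made for the example.

```python
from collections import Counter

# Marks obtained by 20 students in Mathematics (out of 25), as listed above.
marks = [21, 23, 19, 17, 12, 15, 15, 17, 17, 19,
         23, 23, 21, 23, 25, 25, 21, 19, 19, 19]

# Count how many times each distinct mark occurs.
frequency = Counter(marks)

print("Marks  Frequency")
for mark in sorted(frequency):
    print(f"{mark:>5}  {frequency[mark]:>9}")
print(f"Total  {sum(frequency.values()):>9}")
```

The frequencies are 1, 2, 3, 5, 3, 4 and 2 for the marks 12, 15, 17, 19, 21, 23 and 25 respectively, which add up to the 20 students.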

c) Continuous frequency distribution (grouped frequency distribution)

A continuous data series is one where the measurements are only approximations
and are expressed in class intervals within certain limits. In a continuous
frequency distribution, the class intervals are theoretically continuous from the
start of the frequency distribution to the end, without a break; that is, the
variable can take every intermediate value between the smallest and the largest
value in the distribution.

The above data can be presented in groups. These groups are called classes or
class intervals.
Each class interval is bounded by two figures called the class limits.

The lower value of a class interval is called the lower limit and the upper value
of that class interval is called the upper limit. Thus, each class interval has a
lower and an upper limit.
For example:
In the class interval 10 - 20, 10 is the lower limit and 20 is the upper limit.

Exclusive form of data:

The above table is expressed in the exclusive form.

In this form, the class intervals are (0 – 10), (10 – 20), (20 – 30). Each class
includes its lower limit but excludes its upper limit.
So, (10 – 20) means values of 10 or more but less than 20.
(20 – 30) means values of 20 or more but less than 30.
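
The exclusive rule can be sketched in code. This is an illustration only, assuming classes of equal width 10 starting at 0 as in the intervals above. The function name `exclusive_class` and its parameters are hypothetical.

```python
def exclusive_class(value, width=10, start=0):
    """Return (lower, upper) of the exclusive class containing value.

    The class includes its lower limit and excludes its upper limit,
    i.e. lower <= value < upper.
    """
    index = (value - start) // width   # how many whole classes lie below value
    lower = start + index * width
    return lower, lower + width


print(exclusive_class(9))   # (0, 10)
print(exclusive_class(10))  # (10, 20): 10 goes to the class that starts at 10
print(exclusive_class(20))  # (20, 30): 20 is excluded from (10 - 20)
```

A value equal to an upper limit is therefore always counted in the next class, so no value is counted twice.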

Data in the inclusive form:


Marks obtained by 20 students in Mathematics are given below.
23, 0, 14, 10, 15, 3, 8, 16, 18, 20, 1, 3, 20, 23, 24, 15, 24, 22, 14, 13
Let us represent this data in the inclusive form.

Here also, we arrange the data into different groups called class intervals, i.e.,
(0 – 10), (11 – 20), (21 – 30).

(0 – 10) means between 0 and 10, including both 0 and 10.
Here, 0 is the lower limit and 10 is the upper limit.
(11 – 20) means between 11 and 20, including both 11 and 20.
Here, 11 is the lower limit and 20 is the upper limit.
When the data are expressed in the inclusive form, they are converted to the
exclusive form by subtracting 0.5 from the lower limit and adding 0.5 to the upper
limit of each class interval. Here 0.5 is half the gap of 1 between the upper limit
of one class and the lower limit of the next.
For example, (11 – 20) is in the inclusive form and can be taken as
(10.5 - 20.5), which is the exclusive form of the data.
Similarly, (21 – 30) can be taken as (20.5 - 30.5).
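
The conversion can be written as a short function. This is a sketch under the assumption that consecutive inclusive classes are separated by a gap of 1, as in the intervals above. The names `to_exclusive` and `gap` are hypothetical.

```python
def to_exclusive(lower, upper, gap=1):
    """Convert inclusive class limits to exclusive-form limits.

    Half of the gap between consecutive classes is subtracted from the
    lower limit and added to the upper limit.
    """
    half = gap / 2
    return lower - half, upper + half


inclusive_classes = [(0, 10), (11, 20), (21, 30)]
boundaries = [to_exclusive(lower, upper) for lower, upper in inclusive_classes]

print(boundaries)  # [(-0.5, 10.5), (10.5, 20.5), (20.5, 30.5)]
```

After the conversion, the upper limit of each class equals the lower limit of the next, so the classes are continuous.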

By grouping the marks into class intervals of 10, the following frequency
distribution tables can be formed.
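
One way to tally the marks into the inclusive classes is sketched below. It uses the 20 marks and the class intervals given above. The variable names and the printed layout are assumptions, and the output is an illustration rather than a reproduction of the original tables.

```python
# Marks obtained by 20 students in Mathematics, as listed above.
marks = [23, 0, 14, 10, 15, 3, 8, 16, 18, 20,
         1, 3, 20, 23, 24, 15, 24, 22, 14, 13]

# Inclusive class intervals: both limits belong to the class.
inclusive_classes = [(0, 10), (11, 20), (21, 30)]

table = []
for lower, upper in inclusive_classes:
    count = sum(1 for mark in marks if lower <= mark <= upper)
    table.append((lower, upper, count))

# Print the inclusive limits, the exclusive-form limits and the frequency.
print("Inclusive  Exclusive form  Frequency")
for lower, upper, count in table:
    inclusive = f"{lower} - {upper}"
    exclusive = f"{lower - 0.5} - {upper + 0.5}"
    print(f"{inclusive:<9}  {exclusive:<14}  {count}")

# Frequencies for the three classes: 6, 9 and 5 (total 20).
```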
