CHAPTER 4 ORGANISATION OF DATA
MEANING OF CLASSIFICATION
Classification is the process of arranging data into sequences and groups according to their common characteristics.
• Under classification, on the basis of chosen characteristics, the similarity and dissimilarity in the various items are noted and
items exhibiting similarity are grouped together in one class. Through classification, we try to strike a note of homogeneity in
the heterogeneous elements of the collected information.
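The idea of grouping similar items into one class can be sketched in a few lines of Python. This is only an illustration: the records, the field names ("name", "gender", "marital_status") and the function classify are hypothetical and are not part of the chapter.

```python
from collections import defaultdict

def classify(records, characteristic):
    """Group records into classes by the value of one chosen characteristic."""
    classes = defaultdict(list)
    for record in records:
        # Items showing the same value of the characteristic go into the same class.
        classes[record[characteristic]].append(record["name"])
    return dict(classes)

# Hypothetical survey records, invented for illustration only.
people = [
    {"name": "A", "gender": "Female", "marital_status": "Married"},
    {"name": "B", "gender": "Male", "marital_status": "Unmarried"},
    {"name": "C", "gender": "Female", "marital_status": "Unmarried"},
    {"name": "D", "gender": "Male", "marital_status": "Married"},
]

by_gender = classify(people, "gender")
by_marital_status = classify(people, "marital_status")
print(by_gender)
print(by_marital_status)
```

The same raw data gives different classes depending on the characteristic chosen, which is why the basis of classification has to be decided first.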
Objectives of Classification
1. To condense and simplify: In classification, the aim is to eliminate unnecessary details and convert the huge mass of complex data
into a simple, condensed, logical and comprehensible form. It helps in highlighting the significant features of the data. For
example, the huge and fragmented data collected during a population census has to be classified according to gender, marital
status, education, occupation, etc., to ascertain the structure and nature of the population.
2. To explain similarity and dissimilarity of Data: Classification facilitates the grouping of data according to certain similarities
and dissimilarities. This enables the investigators to grasp them easily. Facts like educated and uneducated, married and
unmarried, employed and unemployed, etc. are kept in separate classes.
3. To facilitate comparisons: Classification enables us to make meaningful comparisons, draw inferences and locate facts.
4. To study the relationships: Classification helps in finding out cause-and-effect relationships within the data, based on
some criteria.
For example, the characteristics of income and education can be related after classifying the mass of data.
5. To prepare the data for tabulation: Only classified data can be presented in tabular form. Classification thus provides a
basis for tabulation and further statistical processing.
6. Scientific Arrangement: Classification facilitates arrangement of data in a scientific manner, which increases their accuracy
and reliability.
Requisites of a Good Classification
A good classification must possess the following features:
1. Suitability: The classification should conform to the object of the enquiry. For example, if investigation is conducted to
inquire into the economic conditions of workers, then it will be of no use to classify them on the basis of their religion.
2. Unambiguous: The classification should not lead to any confusion. It should not be difficult to place units into different
groups according to their common characteristics.
3. Flexibility: A good classification should be capable of being adjusted according to the changed situations and conditions.
4. Mutually Exclusive: The classes must not overlap so that an observed value belongs to one and only one of the classes.
There must be no item which can find its way into more than one class.
5. Stability: The principle of classification, once decided, should remain the same throughout the analysis; otherwise it will not be
possible to get meaningful results.
6. Homogeneity: A classification is said to be homogeneous if similar items are placed in a class. All units belonging to a group
should exhibit similar characteristics.
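The requirement that classes be mutually exclusive can be checked mechanically. The sketch below is illustrative only: the function names and the marks are hypothetical, and it assumes that both limits of a class are counted as belonging to that class.

```python
def classes_containing(value, classes):
    """Return every class (lower, upper) that includes the value.

    Assumption: both the lower and the upper limit belong to the class.
    """
    return [(lo, hi) for lo, hi in classes if lo <= value <= hi]

def is_mutually_exclusive(values, classes):
    """True if every value belongs to one and only one class."""
    return all(len(classes_containing(v, classes)) == 1 for v in values)

marks = [5, 10, 19, 20, 27]                      # hypothetical observations
overlapping = [(0, 10), (10, 20), (20, 30)]      # 10 and 20 fit two classes
non_overlapping = [(0, 9), (10, 19), (20, 29)]   # each value fits one class

print(is_mutually_exclusive(marks, overlapping))
print(is_mutually_exclusive(marks, non_overlapping))
```

Under this assumption the first set of classes fails the test because the values 10 and 20 can find their way into two classes each.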
METHODS OF CLASSIFICATION
Statistical data is classified after taking into account the nature, scope, and purpose of an investigation. Generally, data is
classified on the following four bases (see chart):
[Chart: Methods of Classification]
CONCEPT OF VARIABLE
A variable refers to a quantity or characteristic whose value varies from one investigation to another.
Examples:
(i) "Price" is a variable as the prices of different commodities are different.
(ii) "Age" is a variable as age of different students varies.
(iii) Similarly, some more variables are: Height, Weight, Wages, Expenditure, Imports, Production, etc.
It may be noted that different variables are measured in different units. For example, age is measured in years, height in
inches or centimetres, weight in kilograms, income in rupees, etc.
Variable Vs Attribute
'Variable' is generally taken as anything that changes or varies over a period of time. But, in statistics, only that change is taken
as a variable which can be numerically expressed, such as length, height, width, temperature, etc.
Things which cannot be measured numerically, such as intelligence, beauty, efficiency, aptitude, etc., are called 'Attributes'.
Variables are of two kinds:
(i) Discrete Variable (Discontinuous Variable).
(ii) Continuous Variable.
Discrete Variable (Discontinuous Variable)
Variables which are capable of taking only exact or finite values, and generally not any fractional value, are termed discrete
variables. In other words, discrete variables are expressed in terms of complete numbers.
For example, the number of workers or the number of students in a class are discrete variables as they cannot be in fractions. Similarly,
the number of members in a family can be 1, 2 and so on, but cannot be 1.5 or 2.75. Some other examples are the population of a
town, the number of rooms in a house, the total number of mobiles in a family, etc.
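Since the values of a discrete variable are obtained by counting, its frequencies can be tallied directly. A minimal sketch, assuming a hypothetical list of family sizes invented for illustration:

```python
from collections import Counter

# Hypothetical data: number of members in each of 10 families.
family_sizes = [3, 4, 2, 4, 5, 3, 4, 2, 4, 3]

# A discrete variable takes only complete numbers, so each distinct
# value can be counted as it stands.
frequency = Counter(family_sizes)

for size in sorted(frequency):
    print(size, frequency[size])
```

Here four families have 4 members, three have 3 members, two have 2 members and one has 5 members.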
Continuous Variable
Those variables which can take all the possible values (integral as well as fractional) in a given specified range are termed as
continuous variables. In such a case, data is obtained by measurement.
• For example, Temperature is a continuous variable because it can take any value in the range of measurement, like 20°C or
20.1°C or 20.2°C or 20.5°C and so on.
Discrete Variable Vs Continuous Variable
Basis | Discrete Variable | Continuous Variable
Meaning | Discrete variable is a variable which is capable of taking only exact value and generally not any fractional value. | Continuous variable is a variable which can take all the possible values (integral as well as fractional) in a given specified range.
Change in Values | These variables increase in complete numbers. | These variables can increase in fractions as well as in complete numbers.
Data Collection | In case of discrete variable, data is obtained by counting. | In case of continuous variable, data is obtained by measurement.
Example | Number of workers or number of students in a class are discrete variables as they cannot be in fractions. | Height or weight of individuals are continuous variables as they can be in fractions.
STATISTICAL SERIES
The arrangement of classified data in some logical order, like according to the size, according to the time of occurrence or
according to some other measurable or non-measurable characteristics, is known as Statistical Series.
• Statistical series are prepared to present the collected and classified data in a properly arranged way.
• For example, if data pertaining to the marks of 35 students in a class are arranged according to their roll numbers, then it can be
called a statistical series.
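Arranging the same data in two different logical orders can be sketched as follows. The roll numbers and marks are hypothetical, and only five students are used to keep the illustration short.

```python
# Hypothetical marks keyed by roll number, in the order they were collected.
marks_by_roll = {3: 42, 1: 35, 5: 48, 2: 42, 4: 29}

# Series arranged according to roll number.
series_by_roll = sorted(marks_by_roll.items())

# Series arranged according to size (ascending order of marks).
series_by_size = sorted(marks_by_roll.values())

print(series_by_roll)
print(series_by_size)
```

Both results are statistical series; they differ only in the characteristic used for the arrangement.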
Exclusive Series
The classes of the type 10-20, 20-30, 30-40, etc., wherein the upper limit of one class-interval is the lower limit of
the next class, are known as exclusive classes. Because consecutive classes share a limit, such classification ensures
continuity of data: there is no gap between one class and the next.
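A frequency table with exclusive classes can be sketched as below. The sketch assumes the usual convention for exclusive classes, namely that a value equal to the upper limit of a class is counted in the next class (so 20 goes into 20-30, not 10-20). The function name and the marks are hypothetical.

```python
def exclusive_frequency(values, lower, width, n_classes):
    """Count values in classes [lower, lower + width), [lower + width, ...), etc.

    Assumption: the upper limit of each class is excluded from that class.
    """
    counts = {}
    for i in range(n_classes):
        lo = lower + i * width
        counts[(lo, lo + width)] = 0
    for v in values:
        index = int((v - lower) // width)   # which class the value falls in
        if 0 <= index < n_classes:          # ignore values outside all classes
            lo = lower + index * width
            counts[(lo, lo + width)] += 1
    return counts

marks = [10, 15, 20, 20, 29.5, 30, 39]      # hypothetical observations
table = exclusive_frequency(marks, lower=10, width=10, n_classes=3)

for (lo, hi), frequency in table.items():
    print(f"{lo}-{hi}: {frequency}")
```

Fractional values such as 29.5 find a place without difficulty, which is the sense in which exclusive classes keep the data continuous.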
TYPES OF CONTINUOUS SERIES
1. Exclusive Series (classes of the type 10-20, 20-30, etc.)
2. Inclusive Series (classes of the type 10-19, 20-29, etc.)
3. Open-End Distribution (the lower limit of the first class and the upper limit of the last class are not given). Example: Below 5, below 10, below 15
4. Cumulative Frequency Series (Less than and More than series)
5. Mid-value Series (middle values of the class-intervals are given)
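Mid-values and cumulative frequencies can both be derived from an ordinary frequency table. The sketch below is illustrative: the classes and frequencies are hypothetical, and the mid-value is taken as the average of the lower and upper limits of a class.

```python
from itertools import accumulate

# Hypothetical exclusive classes and their frequencies.
classes = [(10, 20), (20, 30), (30, 40)]
frequencies = [2, 3, 2]

# Mid-value series: the middle value of each class-interval.
mid_values = [(lo + hi) / 2 for lo, hi in classes]

# "Less than" series: running total of frequencies up to each upper limit.
less_than = list(accumulate(frequencies))

# "More than" series: running total of frequencies from each lower limit onwards.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for (lo, hi), mid, lt, mt in zip(classes, mid_values, less_than, more_than):
    print(f"{lo}-{hi}: mid-value {mid}, less than {hi}: {lt}, more than {lo}: {mt}")
```

The last entry of the "less than" series and the first entry of the "more than" series both equal the total number of observations.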