Statistics (Unit I)
1 Definitions of Statistics
The word statistics is defined in two different ways: as numerical data and as a body of methods.
In the sense of numerical data, the definition given by Horace Secrist is the most exhaustive.
It is as follows: “By Statistics we mean aggregates of facts affected to a marked extent by
multiplicity of causes numerically expressed, enumerated according to reasonable standards
of accuracy, collected in a systematic manner for a pre-determined purpose and placed in
relation to each other.”
In the sense of statistical methods, it is defined as follows: “Statistics is a body of methods
for making decisions in the face of uncertainty on the basis of numerical data and calculated risk.”
Statistics and Industry:
Statistics is widely used in many industries. In industries, control charts are widely used to
maintain a certain quality level. In production engineering, to find whether the product is
conforming to specifications or not, statistical tools, namely inspection plans, control charts,
etc., are of extreme importance. In inspection plans we have to resort to some kind of
sampling – a very important aspect of statistics.
Statistics and Commerce:
Statistics is the lifeblood of successful commerce. No businessman can afford either to
understock or to overstock his goods. In the beginning he estimates the demand
for his goods and then takes steps to adjust his output or purchases accordingly. Thus statistics is
indispensable in business and commerce. As many multinational companies have entered
the Indian economy, the size and volume of business are increasing. On one side
competition is stiffening, whereas on the other side tastes are changing and new fashions
are emerging. In this connection, market surveys play an important role in showing present
conditions and forecasting likely changes in the future.
Statistics and Economics:
Statistical methods are useful in measuring numerical changes in complex groups and
interpreting collective phenomena. Nowadays statistics is used abundantly
in economic studies. Both in economic theory and practice, statistical methods play an
important role. Alfred Marshall said, “Statistics are the straw out of which I, like every other
economist, have to make the bricks.” It may also be noted that statistical data and
statistical techniques are immensely useful in solving many economic problems relating to wages,
prices, production, distribution of income and wealth and so on. Statistical tools like index
numbers, time series analysis, estimation theory and testing of statistical hypotheses are
extensively used in economics.
Statistics and Education:
Statistics is widely used in education. Research has become a common feature in all branches
of activity. Statistics is necessary for formulating policies to start new courses,
for considering the facilities available for new courses, etc. Many people are engaged in
research work to test past knowledge and evolve new knowledge. This is possible only
through statistics.
Statistics and Planning:
Statistics is indispensable in planning. In the modern world, which can be termed as the
“world of planning”, almost all the organisations in the government are seeking the help of
planning for efficient working, for the formulation of policy decisions and for their
execution. To achieve these goals, statistical data relating to production,
consumption, demand, supply, prices, investments, income, expenditure, etc., and various
advanced statistical techniques for processing, analysing and interpreting such complex data
are of importance. In India statistics plays an important role in planning and commissioning at both
the central and state government levels.
1.5 Variable
A variable is a quantity which can vary from one individual to another. Examples are age,
height, temperature, weight, number of students, number of vehicles, number of children,
income, etc.
Quantitative Variable: When variables have numeric values they are called quantitative
variables.
Qualitative Variable: When variables have non-numeric values, such as honesty, beauty,
intelligence, etc., they are called qualitative variables.
Types of variables
Discrete Variable: A variable whose values can be counted is called a discrete variable.
For example: number of students in a class, number of children in a family.
Continuous Variable: A variable whose values can be measured is called a continuous
variable. In other words, if a variable can take any value within a range then it is
called a continuous variable. For example: if we consider the age of a child, we
observe that as the child grows from 15 years to 17 years (say), his age takes all possible
values within this range. Thus age is a continuous variable.
Other examples of continuous variables are height, temperature and weight.
1.6 Data
Collection of facts and figures about a phenomenon is one of the most important functions of
statistics. A businessman, a minister in the government and his secretary, or an economist
needs a vast amount of data to take the right decisions. These data should be reliable, accurate
and adequate, because if data are unreliable, inaccurate or inadequate the whole data
collection may be faulty and the decisions based on it may be misleading. Data may be collected from
primary sources or secondary sources.
Types of data
Primary data: Primary data are data collected for the first time by the investigator
himself for the purpose of a specific enquiry or study.
Sources from which these data are gathered are known as primary sources.
Methods of collecting primary data
Observations
Interviews
Questionnaire
Email questionnaire
Schedules
Secondary data: Data which are not originally collected by the investigator but are taken from
published or unpublished sources are known as secondary data.
Methods of collecting secondary data
Published sources: annual reports of the WHO, the RBI bulletin, Five-Year Plans, etc.
Unpublished sources: Not all statistical material is published. There are
various sources of unpublished data, such as records maintained by various
Government and private offices and studies made by research institutions, scholars, etc.
Such sources can also be used where necessary.
Formation of Continuous Frequency Distribution:-
A continuous frequency distribution shows frequencies for groups (classes) of values rather
than for individual values.
The following technical terms are important when a continuous frequency distribution is
formed or when data are classified according to class intervals:
Class Limit:-
The class limits are the lowest and the highest values that can be included in the class.
For example:-
Take the class 20-40: the lowest value of the class is 20 and the highest is 40. The two
boundaries of a class are known as the lower limit and the upper limit of the class.
Class Interval:-
The difference between the upper and lower limits of a class is known as the interval of the
class. For the class 100-200, the class interval is 100
(i.e. 200 - 100 = 100).
Class Frequency:-
The number of observations corresponding to a particular class is known as the class
frequency.
For example:
The frequency of the class 1000-1100 is 50, which implies that there are 50 persons
having income between Rs. 1000 and Rs. 1100.
Class mid-point or class marks:-
The mid value which lies half way between lower and upper class limit is known as
class mark of the class.
Class mark = (upper limit of the class + lower limit of the class)/2
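The class interval and class mark formulas above can be sketched in a few lines of Python. This is only an illustration: the function names class_interval and class_mark are hypothetical, and the limits used are the ones quoted in the examples above (100-200 and 20-40).

```python
def class_interval(lower, upper):
    """Width of a class: upper limit minus lower limit."""
    return upper - lower


def class_mark(lower, upper):
    """Mid-point of a class: (upper limit + lower limit) / 2."""
    return (upper + lower) / 2


interval_100_200 = class_interval(100, 200)  # class 100-200
mark_20_40 = class_mark(20, 40)              # class 20-40

print(interval_100_200)  # 100
print(mark_20_40)        # 30.0
```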
Exclusive Method:-
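When class intervals are so fixed that the upper limit of one class is the lower limit of
the next class, it is known as the exclusive method of classification. An observation
exactly equal to the upper limit of a class is excluded from that class and counted in
the next class.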
Income (Rs.)    No. of persons
1000-1100        50
1100-1200       100
1200-1300       200
1300-1400       150
1400-1500        40
1500-1600        10
Total           550
Inclusive Method:-
Under the inclusive method of classification the upper limit of a class is included in
that class itself.
For example:-
Income (Rs.)    No. of persons
1000-1099        50
1100-1199       100
1200-1299       200
1300-1399       150
1400-1499        40
1500-1599        10
Total           550
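The difference between the two methods shows up when a value falls exactly on a class limit. The sketch below tallies a small set of incomes under both methods. The income values and the function names are made up for illustration, and the exclusive rule used (lower limit included, upper limit excluded) is the usual convention, assumed here.

```python
def tally_exclusive(values, limits):
    """Count values per class; a value belongs to a class if lower <= value < upper."""
    counts = [0] * len(limits)
    for value in values:
        for i, (lower, upper) in enumerate(limits):
            if lower <= value < upper:
                counts[i] += 1
                break
    return counts


def tally_inclusive(values, limits):
    """Count values per class; a value belongs to a class if lower <= value <= upper."""
    counts = [0] * len(limits)
    for value in values:
        for i, (lower, upper) in enumerate(limits):
            if lower <= value <= upper:
                counts[i] += 1
                break
    return counts


# Hypothetical incomes in Rs. (not taken from the tables above).
incomes = [1000, 1099, 1100, 1150, 1199, 1200]

exclusive_limits = [(1000, 1100), (1100, 1200), (1200, 1300)]
inclusive_limits = [(1000, 1099), (1100, 1199), (1200, 1299)]

exclusive_counts = tally_exclusive(incomes, exclusive_limits)
inclusive_counts = tally_inclusive(incomes, inclusive_limits)

print(exclusive_counts)  # [2, 3, 1]
print(inclusive_counts)  # [2, 3, 1]
```

Under the exclusive method the income 1100 is counted in 1100-1200, not in 1000-1100. Under the inclusive method the income 1099 is counted in 1000-1099.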
Histogram:
A histogram is a graphical representation of a frequency distribution for a continuous series. A
histogram is a set of rectangles whose areas are proportional to the frequencies represented. When
all class intervals are equal, the height of each rectangle shows the frequency of the class and
the width of the rectangle shows the class interval.
When class intervals are unequal, the frequencies must be adjusted. We take the class which has
the lowest class interval and adjust the frequencies of the other classes in the following
manner: if a class interval is double the lowest class interval, we divide the height of its
rectangle by 2; if it is three times the lowest, we divide the height of its rectangle by 3; and so on.
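The adjustment for unequal class intervals can be sketched as follows. The function name adjusted_heights and the class data are hypothetical, chosen only to show the division by 2 and by 3 described above.

```python
def adjusted_heights(classes):
    """Return rectangle heights for classes given as (lower, upper, frequency).

    Each frequency is divided by the ratio of its class interval
    to the lowest class interval in the distribution.
    """
    smallest = min(upper - lower for lower, upper, _ in classes)
    heights = []
    for lower, upper, frequency in classes:
        ratio = (upper - lower) / smallest
        heights.append(frequency / ratio)
    return heights


# Hypothetical distribution: intervals of 10, 10, 20 and 30.
classes = [(0, 10, 5), (10, 20, 8), (20, 40, 12), (40, 70, 9)]
heights = adjusted_heights(classes)

print(heights)  # [5.0, 8.0, 6.0, 3.0]
```

The class 20-40 is twice as wide as the lowest interval, so its frequency 12 is halved to 6. The class 40-70 is three times as wide, so its frequency 9 is divided by 3.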
Frequency Polygon
A frequency polygon is a figure drawn from a histogram. We may draw a histogram of the
given data and then join, by straight lines, the midpoints of the upper horizontal side of
each rectangle with the adjacent ones. The ends of the figure are extended on both sides
so that they touch the midpoints of the imaginary classes on the X-axis. The figure so
formed is called a frequency polygon.
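As a minimal sketch, the vertices of a frequency polygon can be computed without drawing the histogram first. The example uses the exclusive income distribution given earlier (1000-1100 to 1500-1600) and assumes equal class intervals. The function name polygon_points is hypothetical.

```python
def polygon_points(limits, frequencies):
    """Return (class mark, frequency) vertices of a frequency polygon.

    Assumes all classes have the same interval. One imaginary class with
    zero frequency is added at each end so the polygon meets the X-axis.
    """
    width = limits[0][1] - limits[0][0]
    marks = [(lower + upper) / 2 for lower, upper in limits]
    points = [(marks[0] - width, 0)]          # imaginary class before the first
    points += list(zip(marks, frequencies))   # one vertex per real class
    points.append((marks[-1] + width, 0))     # imaginary class after the last
    return points


limits = [(1000, 1100), (1100, 1200), (1200, 1300),
          (1300, 1400), (1400, 1500), (1500, 1600)]
frequencies = [50, 100, 200, 150, 40, 10]

points = polygon_points(limits, frequencies)
for x, y in points:
    print(x, y)
```

Joining these points in order with straight lines gives the polygon. The first and last points lie on the X-axis at 950 and 1650, the midpoints of the imaginary classes 900-1000 and 1600-1700.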
Ogive
An ogive is a graphical representation of a cumulative frequency distribution. When we construct
an ogive, the class limits are always taken on the X-axis (horizontal axis) and the corresponding
cumulative frequencies on the Y-axis (vertical axis). There are two methods for
constructing an ogive, namely
The “less than” method:- In the less than method we start with the upper limits of the classes.
Against the first upper limit we plot a point at the height of the first class frequency. For the
second upper limit we obtain the cumulative frequency by adding the first and second class
frequencies and mark it on the graph, and we repeat this step for all subsequent upper
limits. When these cumulative frequencies are plotted we get a rising curve.
The “more than” method:- In the more than method we start with the lower limits of the
classes. For the first lower limit we take the total frequency; for the next lower limit we subtract
the first class frequency from the total frequency, and we repeat the procedure for all subsequent
lower limits. When these cumulative frequencies are plotted we get a falling curve.
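Both sets of cumulative frequencies can be sketched in Python. The example below uses the exclusive income distribution given earlier (1000-1100 to 1500-1600). The function names less_than_ogive and more_than_ogive are hypothetical.

```python
from itertools import accumulate


def less_than_ogive(limits, frequencies):
    """Pair each upper limit with the running total of frequencies up to it."""
    uppers = [upper for _, upper in limits]
    return list(zip(uppers, accumulate(frequencies)))


def more_than_ogive(limits, frequencies):
    """Pair each lower limit with the total frequency at or above it."""
    total = sum(frequencies)
    lowers = [lower for lower, _ in limits]
    # Frequency lying below each lower limit: 0 for the first class,
    # then the running total of all earlier classes.
    below = [0] + list(accumulate(frequencies))[:-1]
    return [(lower, total - b) for lower, b in zip(lowers, below)]


limits = [(1000, 1100), (1100, 1200), (1200, 1300),
          (1300, 1400), (1400, 1500), (1500, 1600)]
frequencies = [50, 100, 200, 150, 40, 10]

less_than = less_than_ogive(limits, frequencies)
more_than = more_than_ogive(limits, frequencies)

print(less_than)  # cumulative frequencies rise from 50 to 550
print(more_than)  # cumulative frequencies fall from 550 to 10
```

The less than values rise to the total of 550, while the more than values start at 550 and fall, which is why the two ogives slope in opposite directions.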