Statistics - 1: Presentation of Data
Presentation of Data
Tabulation
The process of placing classified data into tabular form is known as tabulation. A table is a systematic arrangement of statistical data in rows and columns. Rows are horizontal arrangements whereas columns are vertical arrangements. A table may be simple, double or complex depending upon the type of classification.
A statistical table has at least four major parts and some other minor parts.
The major parts are: the Title, the Box Head (column captions), the Stub (row captions) and the Body.
The minor parts are: Prefatory Notes, Footnotes and Source Notes.
Frequency Distribution
A frequency distribution is a tabular arrangement of data into classes according to size or magnitude, along with the corresponding class frequencies (the number of values that fall in each class). The word 'frequency' means 'how often'.
Class Interval
The data is grouped into class intervals if the frequency table becomes too large to help us organise, interpret and analyse the data. The frequency of a class interval is the number of data values that fall in the range specified by the interval. The size of the class interval is often selected as 5, 10, 15 or 20 etc. Each class interval starts at a value that is a multiple of the size. For example, if the size of the class interval is 5, then the class intervals should start at 0, 5, 10, 15, 20 etc. The class intervals will then be 0-4, 5-9, 10-14 etc.
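The grouping rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the `marks` values are hypothetical, and the sketch assumes non-negative whole-number data and inclusive intervals such as 0-4, 5-9.

```python
from collections import Counter

def class_interval(value, size=5):
    """Return the (lower, upper) interval containing value.

    Assumes non-negative integers; each interval starts at a multiple of size.
    """
    lower = (value // size) * size
    return (lower, lower + size - 1)

def frequency_table(values, size=5):
    """Count how many values fall in each class interval, in ascending order."""
    counts = Counter(class_interval(v, size) for v in values)
    return dict(sorted(counts.items()))

# Hypothetical data, used only to show the grouping.
marks = [3, 7, 12, 4, 9, 14, 0, 5, 11, 8]
table = frequency_table(marks)
for (lower, upper), frequency in table.items():
    print(f"{lower}-{upper}: {frequency}")
```

With a class size of 5, the values 0 to 4 land in the first interval, 5 to 9 in the second, and so on, matching the intervals described above.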
Class Limits
Each class is described by two numbers. These numbers are called class limits; the smaller number is called the lower class limit and the larger number is called the upper class limit.
Mid Point of Class Interval: a value within a class interval, especially its midpoint or the nearest integral value, used to represent the interval for computational convenience.
Width of Class Interval: the difference between the lower endpoint of an interval and the lower endpoint of the next interval. Thus, if our study's consecutive intervals are 0 to 4, 5 to 9, etc., the width of each of these intervals is 5.
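As a small sketch of these two definitions, the midpoint and width of the intervals 0-4, 5-9 and 10-14 can be computed as follows. The helper names are hypothetical and chosen only for this illustration.

```python
def midpoint(lower, upper):
    """Midpoint of a class interval, used to represent the whole interval."""
    return (lower + upper) / 2

def class_width(lower, next_lower):
    """Width: lower endpoint of the next interval minus this lower endpoint."""
    return next_lower - lower

intervals = [(0, 4), (5, 9), (10, 14)]
midpoints = [midpoint(lo, hi) for lo, hi in intervals]
# Compare each interval with the one that follows it.
widths = [class_width(a[0], b[0]) for a, b in zip(intervals, intervals[1:])]
print(midpoints, widths)
```

Note that the width is 5 even though 4 - 0 = 4, because the width is measured between consecutive lower endpoints.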
Frequency Density
The frequency density of a class is its frequency divided by its class width. Consider the example below, under Relative Frequency.
Relative Frequency
The relative frequency of a class is its frequency divided by the total number of observations. Dividing the relative frequency by the class width gives the relative frequency density.
For example: if the lower limit of the class is 15 and the upper limit is 30, and the relative frequency is 0.0625, the density for this class is:
Relative frequency / (Upper limit of class - Lower limit of class) = 0.0625 / (30 - 15) = 0.0625 / 15 = 0.0041666...
That is 0.00417 to 5 decimal places.
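The arithmetic in this example can be reproduced with a short Python sketch. The function names are hypothetical, and the count of 5 out of 80 observations is an assumed illustration that happens to give the relative frequency 0.0625 used above; the source does not state the underlying counts.

```python
def relative_frequency(frequency, total):
    """Share of all observations that fall in the class."""
    return frequency / total

def density(value, lower, upper):
    """Divide a frequency (or relative frequency) by the class width."""
    return value / (upper - lower)

# Assumed counts: 5 of 80 observations fall in the class 15 to 30.
rel = relative_frequency(5, 80)   # 0.0625
d = density(rel, 15, 30)          # 0.0625 / 15
print(round(d, 5))
```

The same `density` helper applies to a plain frequency as well, giving the frequency density of the class.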
Diagrammatic Representation
Diagrams may be one-dimensional or two-dimensional. One-dimensional diagrams include bar diagrams; two-dimensional diagrams include the pie diagram. The different bar diagrams are the simple bar diagram, component bar diagram, subdivided bar diagram and percentage bar diagram.
A simple bar diagram is drawn when items are to be compared with respect to a single characteristic. A rectangular bar is constructed with height proportional to the magnitude of each item. Example: Represent the following data regarding the yield per acre of paddy in Karnataka over the last five years.
Year    2001  2002  2003  2004  2005
Yield   20    22    25    27    30
[Figure: bar diagram for the years 2002-2003, 2003-2004 and 2004-2005; listed values 40, 45, 55 and 70, 85, 90]
No of Students
Course    Sec A   Sec B
B.E       10      5
M.Tech    15      10
MBBS      10      15
B.Com     35      30
BBM       30      40
Total     100     100
[Figure: percentage bar diagram of No of Students in Sec A and Sec B, subdivided by course (B.E, M.Tech, MBBS, B.Com, BBM), vertical axis 0% to 100%]
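[Figure: pie diagram with sectors Food, Rent, Fuel and Misc (label: Krishna); listed sector angles 180, 90, 45 and 45 degrees, total 360; listed values 1000, 1500, 1000, 4000, 1500]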
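In a pie diagram each sector's angle is the item's share of the total multiplied by 360 degrees. The sketch below shows that calculation. The amounts are hypothetical, chosen only so that the angles come out as 180, 90, 45 and 45; they are not the figures from the original example, and the function name is likewise an assumption.

```python
def pie_angles(values):
    """Sector angle in degrees for each item: 360 * value / total."""
    total = sum(values.values())
    return {name: 360 * value / total for name, value in values.items()}

# Hypothetical monthly expenditure, for illustration only.
expenses = {"Food": 4000, "Rent": 2000, "Fuel": 1000, "Misc": 1000}
angles = pie_angles(expenses)
for name, angle in angles.items():
    print(f"{name}: {angle} degrees")
```

The angles always add up to 360, which is a quick check on the arithmetic before drawing.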
Graphical Representation
Graphs are used mainly for frequency distributions. Some types of graphs are the histogram, frequency polygon, frequency curve and ogives (cumulative frequency curves).
Histogram
The frequency distribution is represented by a set of rectangular bars with area proportional to class frequency. If the class intervals have equal width, the variable is taken along the X-axis and the frequency along the Y-axis, and a rectangle is constructed on each class interval.
[Figure: histogram of No of People (frequency) against Age for the classes 0-10, 10-20, 20-30, 30-40 and 40-50]
To locate the mode on a histogram, join the upper left corner of the highest rectangle to the upper left corner of the rectangle adjacent on its right, and the upper right corner of the highest rectangle to the upper right corner of the rectangle adjacent on its left. From the point where these two lines intersect, draw a perpendicular to the X-axis. The X reading at that point gives the mode of the distribution.
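The graphical construction can be checked numerically by solving for the point where the two joining lines cross. The sketch below assumes equal class widths and a single highest rectangle with at least one lower neighbour; the function name is hypothetical. It uses the age distribution from this document (0-10: 5, 10-20: 10, 20-30: 15, 30-40: 12, 40-50: 8).

```python
def mode_from_histogram(lower_limits, width, freqs):
    """X coordinate where the two diagonal lines of the construction intersect.

    Assumes equal class widths and one tallest rectangle; a missing
    neighbour at either end is treated as having frequency 0.
    """
    i = max(range(len(freqs)), key=freqs.__getitem__)  # tallest rectangle
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                  # left neighbour
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0     # right neighbour
    # Fraction of the class width at which the two lines cross.
    t = (f1 - f0) / (2 * f1 - f0 - f2)
    return lower_limits[i] + t * width

lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 10, 15, 12, 8]
mode = mode_from_histogram(lower_limits, 10, freqs)
print(mode)
```

Here the highest rectangle is 20-30, so the result lies inside that class, pulled toward the taller of its two neighbours.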
When the class intervals are not equal, the wider class is adjusted before the histogram is drawn: divide the class interval into two equal class intervals, and calculate the adjusted frequencies by dividing the frequency of that class interval by 2.
Age      Adjusted Frequency
0-10     5
10-20    10
20-30    10
30-40    15
40-50    15
50-60    15
60-70    12
70-80    8
80-90    8
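The adjustment step can be sketched as a small helper that splits one double-width class into two equal classes and halves its frequency. The function name and the example class (10-30 with frequency 20) are assumptions made for illustration; the original unadjusted data is not given in the source.

```python
def split_class(lower, upper, frequency):
    """Split a class into two equal halves, each with half the frequency."""
    mid = (lower + upper) / 2
    half = frequency / 2
    return [((lower, mid), half), ((mid, upper), half)]

# Hypothetical wide class: 10-30 with frequency 20.
adjusted = split_class(10, 30, 20)
for (lo, hi), f in adjusted:
    print(f"{lo:g}-{hi:g}: {f:g}")
```

After the split every class has the same width, so the rectangle heights can be read directly as frequencies.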
Frequency Polygon
The mid values of the class intervals are plotted against the frequencies of the class intervals. These points are joined by straight lines. Consider the example used in the first histogram problem.
[Figure: frequency polygon of No of People against Age for the classes 0-10, 10-20, 20-30, 30-40 and 40-50]
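The points of a frequency polygon are simply (mid value, frequency) pairs. A minimal sketch, using the age distribution from this document and a hypothetical function name:

```python
def polygon_points(classes, freqs):
    """Pair the mid value of each class interval with its frequency."""
    return [((lower + upper) / 2, f) for (lower, upper), f in zip(classes, freqs)]

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 10, 15, 12, 8]
points = polygon_points(classes, freqs)
print(points)
```

Joining these points in order with straight lines gives the polygon; passing them to any plotting tool is left out here to keep the sketch free of external libraries.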
Frequency Curve
First we draw a histogram for the given data. Then we join the midpoints of the tops of the rectangles by a smooth curve. The total area under the frequency curve represents the total frequency. Frequency curves are regarded as the most useful form of representing a frequency distribution.
Age      No of People
0-10     5
10-20    10
20-30    15
30-40    12
40-50    8
Ogives
Less than ogive: Variables are taken along the X-axis and less than cumulative frequencies are taken along the Y-axis. Less than cumulative frequencies are plotted against the upper limit of the class interval and joined by a smooth curve.
More than ogive: More than cumulative frequencies are plotted against the lower limit of the class interval and joined by a smooth curve.
From the meeting point of these two ogives, if we draw a perpendicular to the X-axis, the point where it meets the X-axis gives the median of the distribution.
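The two sets of cumulative frequencies, and the median read from where the ogives meet, can be sketched as follows. The sketch uses the age distribution from this document and approximates the smooth curves by straight-line segments between plotted points, which is an assumption; under it the two ogives cross where the cumulative frequency equals half the total. The function name is hypothetical.

```python
from itertools import accumulate

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 10, 15, 12, 8]
total = sum(freqs)

# Less than cumulative frequencies, plotted against upper class limits.
less_than = list(accumulate(freqs))
# More than cumulative frequencies, plotted against lower class limits.
more_than = [total - c for c in [0] + less_than[:-1]]

def median_from_ogives(classes, freqs):
    """X value where the less than ogive reaches half the total frequency,
    using straight-line interpolation inside the class that contains it."""
    half = sum(freqs) / 2
    cumulative = 0
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= half:
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f

median = median_from_ogives(classes, freqs)
print(less_than, more_than, round(median, 2))
```

Because the less than and more than values at any X add up to the total, the crossing point is where both equal half the total, which is why the perpendicular from it marks the median.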
Example: Construct an ogive from the data below and determine the median