Chapter 2-Stat I Ppt
Chapter Two
Data Collection & Presentation
BY: Girma.M(MBA)
Data collection
Definition of data:
Data are the facts and figures collected and summarized for
Girma.M.2021
Data collection...
Gathering of information systematically and in a meaningful manner.
Scope of the data collection
Classification of data
Bases of classification
Qualitative Classification: data are arranged according to
attributes like color, religion, marital status, sex, etc.
Quantitative Classification: data are arranged according to characteristics
that can be measured or counted in definite terms.
Geographical Classification: data are arranged according to places
like continents, regions, and countries.
Chronological Classification: data are arranged according to time, such as by year.
Types of data
Based on the nature of the data set or variables:
Qualitative Variables: nonnumeric variables that can't be
measured. E.g. gender, religion, attitude, etc.
Quantitative Variables: numerical variables that can be
measured. E.g. number of bedrooms in your house.
Depending upon the sources utilized:
Primary Data
Secondary Data
Primary data
Data collected by the investigator or user directly from the source.
First-hand information is gathered.
Such data are original in character and are generated by a survey
conducted by the concerned body.
Secondary data
Data that have already been collected and analyzed by
some earlier agency for its own use.
A source that is not primary.
Used when it is impossible to collect first-hand information.
Compiled from published and unpublished sources or files.
Method of data collection
Presentation of data
Presentation is a statistical procedure of arranging and putting data
in the form of tables, graphs and diagrams.
The presentation of data is broadly classified into the following
categories:
Tabular presentation
Diagrammatic presentation
Graphic presentation
Tabular presentation of data
Advantages of tabulation
Frequency distribution
The number of times a value of a variable occurs in a data set is
called its frequency.
A tabular representation of the values of a variable together with the
corresponding frequencies is called a frequency distribution (FD).
Categorical FD
Procedure to construct CFD
Step 1: Make a table and categorize the data into their respective classes.
Step 2: Tally the data and place the result in column (2).
Step 3: Count the tallies and place the result in column (3).
Step 4: Find the percentage of values in each class using
% = (f / n) * 100,
where f = frequency of the class and n = total number of values.
Step 5: Find the totals for columns (3) and (4).
NB: Percentages are not normally a part of a frequency distribution.
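The steps above can be sketched in Python as follows. This is only an illustration: the list `responses` is hypothetical data made up for the sketch (it is not the social worker's data set), and the variable names are assumptions.

```python
from collections import Counter

# Hypothetical marital-status codes, NOT the data from the example
# (M=married, S=single, W=widowed, D=divorced).
responses = list("MSDWDSSMMM")

counts = Counter(responses)   # Steps 2-3: tally and count each class
n = len(responses)            # total number of values

# Step 4: percentage of each class = f / n * 100
table = [(cls, f, f / n * 100) for cls, f in sorted(counts.items())]

print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for cls, f, pct in table:
    print(f"{cls:<6}{f:>10}{pct:>9.1f}")

# Step 5: totals for the frequency and percent columns
total_f = sum(f for _, f, _ in table)
total_pct = sum(pct for _, _, pct in table)
print(f"{'Total':<6}{total_f:>10}{total_pct:>9.1f}")
```

The frequency total should equal n and the percent total should equal 100, which is a quick check that no observation was missed in the tally.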
Example
A social worker collected the following data on marital status for 25
persons (M=married, S=single, W=widowed, D=divorced):
By following the procedures we can come up with the following solution
Ungrouped frequency distribution
Constructing UFD
Steps in constructing a discrete ungrouped frequency distribution
4. Write the possible values of the variable in ascending order in the first
column.
Example
The following data represent the number of books read by 20 students.
Solution
Grouped frequency distribution
Definitions of terms
Class limits: Separates one class in a grouped frequency distribution
from another.
Units of measurement (U): the distance between two possible
consecutive measures. It is usually taken as 1, 0.1, 0.01, 0.001
Class mark (Mid points): it is the average of the lower and upper
class limits or the average of upper and lower class boundary.
Definition of terms...
Class boundaries: Separate one class in a grouped frequency
distribution from another.
The boundaries have one more decimal place than the raw data and
therefore do not appear in the data.
The lower class boundary is found by subtracting U/2 from the
corresponding lower class limit.
The upper class boundary is found by adding U/2 to the
corresponding upper class limit.
Class width: the difference between the upper and lower class
boundaries of any class. It also equals the difference between the lower
(or upper) limits of two consecutive classes, and the difference
between any two consecutive class marks.
Example
Construct a frequency distribution for the following data
Solutions:
Step 2: Find the highest and the lowest value H=39, L=6
Solution…
Step 5: Find the class width; W = R/k = 33/6 = 5.5 ≈ 6 (rounding up).
Step 6: Select the starting point (the first lower class limit); let it be the
minimum observation. 6, 12, 18, 24, 30, 36 are the lower class limits.
Step 7: Find the upper class limits; e.g. the first upper class limit = 12 - U = 12 - 1 = 11.
11, 17, 23, 29, 35, 41 are the upper class limits.
So, combining step 6 and step 7, one can construct the
following classes.
Class limits
6 – 11
12 – 17
18 – 23
24 – 29
30 – 35
36 – 41
Step 8: Find the class boundaries.
- For class 1: lower class boundary = 6 - U/2 = 5.5
- Upper class boundary = 11 + U/2 = 11.5
Find the remaining class boundaries by subtracting U/2 from the lower
limits and adding U/2 to the upper limits.
Class boundary
5.5 – 11.5
11.5 – 17.5
17.5 – 23.5
23.5 – 29.5
29.5 – 35.5
35.5 – 41.5
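Steps 5 to 8 can be reproduced with a short Python sketch. The values H = 39, L = 6, k = 6 and U = 1 are taken from the worked example; the variable names are assumptions made for the illustration.

```python
import math

H, L = 39, 6   # highest and lowest values (Step 2)
k = 6          # number of classes used in the example
U = 1          # unit of measurement

R = H - L                 # range = 33
W = math.ceil(R / k)      # Step 5: 33/6 = 5.5, rounded up to 6

# Step 6: lower class limits start at the minimum observation
lower_limits = [L + i * W for i in range(k)]
# Step 7: each upper limit is one unit below the next lower limit
upper_limits = [lo + W - U for lo in lower_limits]
# Step 8: boundaries extend the limits by U/2 on each side
boundaries = [(lo - U / 2, up + U / 2)
              for lo, up in zip(lower_limits, upper_limits)]
# Class marks: average of the lower and upper class limits
class_marks = [(lo + up) / 2 for lo, up in zip(lower_limits, upper_limits)]

for (lo, up), (lb, ub), m in zip(zip(lower_limits, upper_limits),
                                 boundaries, class_marks):
    print(f"{lo:>2} - {up:<2}   {lb:>4} - {ub:<4}   {m}")
```

Consecutive class marks differ by W = 6, which matches the definition of class width given earlier.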
The complete frequency distribution follows
Relative Frequency Distribution
Shows the relative concentration of items in a given class
interval
Relative frequency of a class =Class frequency/Total frequency
Example 2.4 : The following table shows the frequency
distribution of the wages (X) of 100 construction workers.
Wages (X)   Number of workers (f)
75-80       9
80-85       12
85-90       15
90-95       11
95-100      20
100-105     20
105-110     11
110-115     2
Total       100
Wages (X)   Frequency (f)   Relative frequency   Percentage
75-80       9               0.09                 9%
80-85       12              0.12                 12%
85-90       15              0.15                 15%
90-95       11              0.11                 11%
95-100      20              0.20                 20%
100-105     20              0.20                 20%
105-110     11              0.11                 11%
110-115     2               0.02                 2%
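As a minimal sketch, the relative frequencies of Example 2.4 can be computed in Python. The class labels and frequencies come from the example; the variable names are assumptions.

```python
classes = ["75-80", "80-85", "85-90", "90-95",
           "95-100", "100-105", "105-110", "110-115"]
freqs = [9, 12, 15, 11, 20, 20, 11, 2]

total = sum(freqs)  # total frequency (100 workers)

# Relative frequency of a class = class frequency / total frequency
rel_freqs = [f / total for f in freqs]

for cls, f, rf in zip(classes, freqs, rel_freqs):
    print(f"{cls:<8}{f:>4}{rf:>7.2f}{rf * 100:>6.0f}%")
```

The relative frequencies always sum to 1 (100%), which gives a simple check on the arithmetic.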
Cumulative Frequency Distribution
• Example 2.5 : The table below shows the ‘less than’ cumulative
frequency distribution of marks of 70 students in a class.
Solution
Marks   Frequency   ‘Less than’ cumulative frequency
30-35   5           5
35-40   10          5+10=15
40-45   15          15+15=30
45-50   30          30+30=60
50-55   5           60+5=65
55-60   5           65+5=70
• The above ‘less than’ cumulative frequency distribution can
also be written as follows
Marks          Cumulative frequency
Less than 30   0
Less than 35   5
Less than 40   15
Less than 45   30
Less than 50   60
Less than 55   65
Less than 60   70
• ‘More than’ cumulative frequency distribution of marks of 70
students
Marks   Frequency   ‘More than’ cumulative frequency
30-35   5           65+5=70
35-40   10          55+10=65
40-45   15          40+15=55
45-50   30          10+30=40
50-55   5           5+5=10
55-60   5           5
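Both cumulative distributions of Example 2.5 can be obtained with running sums. The sketch below uses `itertools.accumulate` from the Python standard library; the frequencies come from the example and the variable names are assumptions.

```python
from itertools import accumulate

classes = ["30-35", "35-40", "40-45", "45-50", "50-55", "55-60"]
freqs = [5, 10, 15, 30, 5, 5]

# 'Less than': running total from the first class downwards
less_than = list(accumulate(freqs))

# 'More than': running total from the last class upwards,
# then reversed back into the original class order
more_than = list(accumulate(reversed(freqs)))[::-1]

for cls, f, lt, mt in zip(classes, freqs, less_than, more_than):
    print(f"{cls:<7}{f:>4}{lt:>5}{mt:>5}")
```

The last ‘less than’ value and the first ‘more than’ value both equal the total number of students, 70.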
• The above ‘more than’ C.F. Distribution can also be
expressed in the following form:
DIAGRAMMATIC PRESENTATION OF DATA
BAR CHARTS
i. Simple bar chart
In simple bar charts, each bar represents one and only one figure.
A simple bar chart is usually constructed to represent total only.
[Figure: simple bar chart; x-axis: Department (Math., Stat., Physics, Chemistry), y-axis: Number of students]
ii. Component (sub-divided) bar chart
The component bar chart shows the breakdown of an aggregate into its
constituent parts for a given year, place, or sector.
Example : The table and chart below show the revenue expenditure of a country
on education.
Education Expenditure (in million)
Level              1978-80   1980-81   1981-82
Primary            60        80        40
Secondary          40        60        60
Higher Education   20        40        20
[Figure: component bar chart of education expenditure by year]
iii. Multiple Bar chart
• Here the interrelated component parts are shown in adjoining bars,
colored or marked differently, thus allowing comparison between
different parts.
• Example : The chart below shows the revenue expenditure of a
country on education.
[Figure: multiple bar chart of education expenditure for Primary, Secondary and Higher Education, by year (1978-80, 1980-81, 1981-82)]
iv. Pie Chart
• A pie chart is a circle divided by radial lines into sections.
• It displays quantities as percentages of a given total.
• The total area of the pie represents 100 percent.
• The angles at the centre sum to 360°.
• Degrees = Amount × 360° / Total
Example : The following table shows the monthly expenses of a family
with an income of 1000 Birr.
Item               Food   Clothing   Rent   Others   Total
Amount (in Birr)   400    200        250    150      1000
Item       Amount (in Birr)   Degrees   Percent
Food       400                144       40
Clothing   200                72        20
Rent       250                90        25
Others     150                54        15
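The sector angles and percentages can be checked with a short Python sketch that applies Degrees = Amount × 360 / Total to the expense data of the example. The dictionary and variable names are assumptions made for the illustration.

```python
# Monthly expenses of the family, in Birr (from the example)
expenses = {"Food": 400, "Clothing": 200, "Rent": 250, "Others": 150}

total = sum(expenses.values())  # 1000 Birr

# Angle of each sector: amount * 360 / total
angles = {item: amount * 360 / total for item, amount in expenses.items()}
# Share of each sector as a percentage of the total
percents = {item: amount * 100 / total for item, amount in expenses.items()}

for item in expenses:
    print(f"{item:<9}{expenses[item]:>5}{angles[item]:>7.0f}{percents[item]:>5.0f}%")
```

The angles sum to 360 and the percentages to 100, so every sector of the circle is accounted for.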
Solution…
[Figure: pie chart of the family's monthly expenses, with sectors labelled by percentage]
GRAPHICAL PRESENTATION
i. Histogram
A histogram is a graphical display of the distribution of a data set.
It looks like a vertical bar graph, except that the columns touch each other.
Lower class boundaries (LCBs) are marked on the x-axis and the frequencies along the y-axis
according to a suitable scale.
Example 2.10: Construct a histogram for the following FD
Solution
[Figure: histogram; x-axis: Class boundary (9.5, 19.5, 29.5, 39.5, 49.5), y-axis: Frequency]
ii. Frequency polygon
The values of a discrete variable, or the class marks of the classes, are plotted against the
frequencies, and the plotted points are joined by straight lines.
For a large number of classes, a frequency polygon is preferable.
[Figure: frequency polygon; x-axis: Class mark (4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5), y-axis: Frequency]
Ogive
Graphic representation of a cumulative frequency distribution.
Ogives are of two kinds.
i. ‘Less than’ ogive: upper class boundaries are plotted against the
‘less than’ cumulative frequencies of the respective classes, and
adjacent points are joined by straight lines.
When these frequencies are plotted, we get a rising curve.
ii. ‘More than’ ogive: lower class boundaries are plotted against
the ‘more than’ cumulative frequencies of their respective classes,
and adjacent points are joined by straight lines.
When these frequencies are plotted, we get a declining curve.
Example : Using the following data, construct:
(a) the ‘Less than’ ogive and
(b) the ‘More than’ ogive for the frequency distribution.
Class limits   Frequency
10 – 19        4
20 – 29        5
30 – 39        8
40 – 49        6
50 – 59        2
To construct both the less than and more than ogives, find the
UCB and LCB of each class, and the less than cumulative and more than
cumulative frequencies.
Class limits Frequency LCF MCF LCB UCB
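The columns of this table can be derived from the class limits and frequencies of the example. The Python sketch below assumes a unit of measurement U = 1 (the data are whole numbers); the variable names are also assumptions.

```python
from itertools import accumulate

limits = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
freqs = [4, 5, 8, 6, 2]
U = 1  # assumed unit of measurement

lcb = [lo - U / 2 for lo, _ in limits]   # lower class boundaries
ucb = [up + U / 2 for _, up in limits]   # upper class boundaries

lcf = list(accumulate(freqs))               # 'less than' cumulative freq.
mcf = list(accumulate(freqs[::-1]))[::-1]   # 'more than' cumulative freq.

# Points to plot: UCB vs LCF for the 'less than' ogive,
# LCB vs MCF for the 'more than' ogive
less_than_points = list(zip(ucb, lcf))
more_than_points = list(zip(lcb, mcf))

print("Class limits  Frequency  LCF  MCF   LCB   UCB")
for (lo, up), f, a, b, c, d in zip(limits, freqs, lcf, mcf, lcb, ucb):
    print(f"{lo} - {up}{f:>11}{a:>5}{b:>5}{c:>6}{d:>6}")
```

Joining `less_than_points` in order gives the rising curve and joining `more_than_points` gives the declining curve described above.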
• The less than Ogive
[Figure: ‘less than’ ogive; y-axis: Frequency]
The more than ogive
[Figure: ‘more than’ ogive; y-axis: Frequency]
The End
Thank You!