
Chapter 2-Stat I Ppt


Basic Statistics

Chapter Two
Data Collection & Presentation

BY: Girma.M(MBA)

Data collection

Definition of data:

Data are the facts and figures collected and summarized for presentation, analysis and interpretation.

All the data collected in a study are referred to as the data set.

Data are the raw material of research.

Girma.M.2021
Data collection...
Data collection is the systematic and purposeful gathering of information to accomplish the objectives of a study.

The term also covers the methods used in gathering the required information.

Care should be taken in the data collection process: inaccurate or inadequate data make the whole analysis faulty and lead to misleading decisions.
Scope of the data collection

i. Census or complete enumeration

• A census is a method of studying the whole population.
• Its results are more representative, accurate and reliable.
• It is an appropriate method of obtaining information on rare events.
• It is used as a basis for various surveys.
ii. Sample survey
Some of the advantages of a sample survey are:
• Reduced cost: a sample costs less to collect and process than a census.
• Greater speed: data collection takes less time.
• Greater scope
• Greater accuracy
• Applicability when testing is destructive, that is, when measuring an item destroys it.
Classification of data
Bases of classification
Qualitative classification: data are arranged according to attributes such as color, religion, marital status or sex.
Quantitative classification: data are arranged according to characteristics that can be measured or counted in definite terms.
Geographical classification: data are arranged according to places such as continents, regions and countries.
Chronological classification: data are arranged according to time, such as by year.
Types of data
Based on the nature of the data set or variables:
Qualitative variables: nonnumeric variables that cannot be measured numerically, e.g. gender, religion, attitude.
Quantitative variables: numerical variables that can be measured or counted, e.g. the number of bedrooms in your house.
Depending on the source used:
Primary data
Secondary data
Primary data
Primary data are collected by the investigator or user directly from the source.
First-hand information is gathered.
Such data are original in character and are generated by a survey conducted by the body concerned.
Secondary data
Secondary data are data that have already been collected and analyzed by someone else for their own use.
They come from a source that is not primary.
They are used when it is impossible to collect first-hand information.
They are compiled from published and unpublished sources or files.
Method of data collection

Methods of collecting primary data

Many authors commonly state three methods of collecting primary data. These are:
Interview
Direct observation
Questionnaire method
Presentation of data
Presentation is the statistical procedure of arranging data and putting them in the form of tables, graphs and diagrams.
The presentation of data is broadly classified into the following categories:
Tabular presentation
Diagrammatic presentation
Graphic presentation
Tabular presentation of data

Tabulation is the process of putting classified data in the form of a table.
A table is a systematic arrangement of classified data in columns and rows, with classes and frequencies.
Thus, a statistical table makes it possible for the investigator to present a huge mass of data in a detailed and orderly form.
Advantages of tabulation

It simplifies complex data.

It presents facts in the minimum possible space.

It helps to avoid unnecessary repetition and explanation.

It facilitates comparison of related facts.

It facilitates computation of various statistical measures such as averages, dispersion and correlation.

It facilitates data presentation in the form of graphs and diagrams.
Frequency distribution
The number of times a value of a variable occurs in a data set is called its frequency.
A tabular representation of the values of a variable together with their corresponding frequencies is called a frequency distribution (FD).

TYPES OF FREQUENCY DISTRIBUTION

i. Basic types of frequency distributions (absolute)
Categorical frequency distribution (CFD)
Ungrouped frequency distribution (UFD)
Grouped frequency distribution (GFD)
Categorical FD
Procedure to construct CFD
Step 1: Make a table and categorize the data into their respective classes.
Step 2: Tally the data and place the result in column (2).
Step 3: Count the tallies and place the result in column (3).
Step 4: Find the percentage of values in each class using % = (f / n) × 100 and place the result in column (4),
where f = frequency of the class and n = total number of values.
Step 5: Find the totals for columns (3) and (4).
NB: Percentages are not normally a part of a frequency distribution.
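
The procedure for a categorical frequency distribution can be sketched in Python as follows. The list `responses` is hypothetical data made up for illustration (it is not the data set of the example that follows), and the function name `categorical_fd` is likewise an assumption.

```python
from collections import Counter

# Hypothetical marital-status codes for 25 persons (illustration only;
# these are NOT the data from the example in the text).
responses = list("MSSDWMMSDMWSMMDSMSWMSMDSM")


def categorical_fd(values):
    """Return rows of (class, frequency, percent) for a categorical variable."""
    counts = Counter(values)      # Steps 2-3: tally and count each class
    n = sum(counts.values())      # total number of values
    # Step 4: percent = f / n * 100
    return [(cls, f, f / n * 100) for cls, f in sorted(counts.items())]


rows = categorical_fd(responses)
for cls, f, pct in rows:
    print(f"{cls:<6}{f:>4}{pct:>8.1f}%")

# Step 5: totals of the frequency and percent columns
total_f = sum(f for _, f, _ in rows)
total_pct = sum(pct for _, _, pct in rows)
print(f"{'Total':<6}{total_f:>4}{total_pct:>8.1f}%")
```

The frequencies should add up to n and the percentages to 100, which is a quick check on the tally.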
Example
A social worker collected the following data on marital status for 25 persons (M = married, S = single, W = widowed, D = divorced).

Following the procedure above, we arrive at the following solution.
Ungrouped frequency distribution

• Shows a distribution in which each individual value of a variable is listed separately with its frequency.

• Is used for variables measured on an ordinal, interval or ratio scale.
Constructing UFD
Steps in constructing a discrete ungrouped frequency distribution

1. Make sure that the variable is discrete.
2. Determine the possible values of the variable.
3. Prepare three columns: Values, Tally Marks, Frequency.
4. Write the possible values of the variable in ascending order in the first column.
Example
The following data represent the marks of 20 students.

Required: Construct an ungrouped frequency distribution.
Solution

Mark   Tally   Frequency
60     //      2
62     /       1
63     /       1
65     /       1
70     ////    4
74     /       1
75     //      2
76     /       1
80     ///     3
85     ///     3
90     /       1
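
As a rough illustration, the same table can be produced with Python's `collections.Counter`. The raw listing of the 20 observations is not reproduced here, so the list `marks` is rebuilt from the solution table by repeating each value as many times as its frequency. The function name `ungrouped_fd` is an assumption.

```python
from collections import Counter

# Frequencies copied from the solution table; the raw marks are rebuilt from them
# because the original listing of the 20 observations is not available.
table = {60: 2, 62: 1, 63: 1, 65: 1, 70: 4, 74: 1, 75: 2, 76: 1, 80: 3, 85: 3, 90: 1}
marks = [value for value, f in table.items() for _ in range(f)]


def ungrouped_fd(data):
    """Map each distinct value, in ascending order, to its frequency."""
    counts = Counter(data)
    return {value: counts[value] for value in sorted(counts)}


fd = ungrouped_fd(marks)
for value, f in fd.items():
    # One "/" per observation stands in for the tally marks
    print(f"{value:>4}  {'/' * f:<6}{f:>3}")
print("Total", sum(fd.values()))
```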
Grouped frequency distribution

• Used when the range of the data is large.

Definitions of terms
Class limits: the values that separate one class in a grouped frequency distribution from another.
Unit of measurement (U): the distance between two possible consecutive measures. It is usually taken as 1, 0.1, 0.01, 0.001, and so on.
Class mark (midpoint): the average of the lower and upper class limits or, equivalently, the average of the lower and upper class boundaries.
Definition of terms...
Class boundaries: the values that separate one class in a grouped frequency distribution from another, leaving no gap between consecutive classes.
The boundaries have one more decimal place than the raw data and therefore do not appear in the data.
The lower class boundary is found by subtracting U/2 from the corresponding lower class limit.
The upper class boundary is found by adding U/2 to the corresponding upper class limit.
Class width: the difference between the upper and lower class boundaries of any class. It also equals the difference between the lower limits of two consecutive classes and the difference between any two consecutive class marks.
Example
Construct a frequency distribution for the following data.

Solution:

Step 1: Create an array (arrange the data in ascending order).

Step 2: Find the highest and the lowest values: H = 39, L = 6.

Step 3: Find the range: R = H - L = 39 - 6 = 33.

Step 4: Compute the desired number of classes using Sturges' formula: k = 1 + 3.32 log n = 1 + 3.32 log(20) = 5.32, which rounds up to 6.
Solution…
Step 5: Find the class width: W = R/k = 33/6 = 5.5, which rounds up to 6.

Step 6: Select the starting point (the first lower class limit); let it be the minimum observation. The lower class limits are 6, 12, 18, 24, 30, 36.

Step 7: Find the upper class limits. For example, the first upper class limit = 12 - U = 12 - 1 = 11. The upper class limits are 11, 17, 23, 29, 35, 41.

Combining steps 6 and 7, one can construct the following classes.

Class limits
6 – 11
12 – 17
18 – 23
24 – 29
30 – 35
36 – 41

Step 8: Find the class boundaries.
- For class 1: lower class boundary = 6 - U/2 = 5.5
- Upper class boundary = 11 + U/2 = 11.5
Find the remaining class boundaries by subtracting U/2 from the lower limits and adding U/2 to the upper limits.
Class boundary
5.5 – 11.5
11.5 – 17.5
17.5 – 23.5
23.5 – 29.5
29.5 – 35.5
35.5 – 41.5

Step 9: Compute the class marks: CM = (LCL + UCL)/2 or (LCB + UCB)/2.
The complete frequency distribution follows

Class Limit   Class Boundary   Class Mark   Frequency
6-11          5.5-11.5         8.5          2
12-17         11.5-17.5        14.5         2
18-23         17.5-23.5        20.5         7
24-29         23.5-29.5        26.5         4
30-35         29.5-35.5        32.5         3
36-41         35.5-41.5        38.5         2
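
Steps 2 to 9 can be sketched in Python using only the figures stated in the solution (H = 39, L = 6, n = 20, U = 1). The function name `grouped_classes` and its parameters are assumptions, and rounding the width up to a whole number is appropriate only when the data are integers (U = 1), as they are here.

```python
import math


def grouped_classes(low, high, n, unit=1):
    """Build class limits, boundaries and marks following Steps 2-9."""
    k = math.ceil(1 + 3.32 * math.log10(n))   # Step 4: Sturges' formula, rounded up
    width = math.ceil((high - low) / k)       # Steps 3 and 5: range / k, rounded up
    classes = []
    for i in range(k):
        lcl = low + i * width                 # Step 6: lower class limit
        ucl = lcl + width - unit              # Step 7: upper class limit
        lcb = lcl - unit / 2                  # Step 8: lower class boundary
        ucb = ucl + unit / 2                  # Step 8: upper class boundary
        mark = (lcl + ucl) / 2                # Step 9: class mark
        classes.append({"limits": (lcl, ucl), "boundaries": (lcb, ucb), "mark": mark})
    return classes


classes = grouped_classes(low=6, high=39, n=20)
for c in classes:
    print(c["limits"], c["boundaries"], c["mark"])
```

The frequencies themselves still have to be tallied from the raw data, which is not reproduced in this example.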
Relative Frequency Distribution
Shows the relative concentration of items in each class interval.
Relative frequency of a class = Class frequency / Total frequency
Example 2.4: The following table shows the frequency distribution of the wages (X) of 100 construction workers.

Wages (X)   Number of workers (f)
75-80       9
80-85       12
85-90       15
90-95       11
95-100      20
100-105     20
105-110     11
110-115     2
Total       100

Required: Calculate the relative frequency of each class.
Solution
Wages (X)   Number of workers (f)   Relative frequency (in decimals)   Relative frequency (in %)
75-80       9                       0.09                               9%
80-85       12                      0.12                               12%
85-90       15                      0.15                               15%
90-95       11                      0.11                               11%
95-100      20                      0.20                               20%
100-105     20                      0.20                               20%
105-110     11                      0.11                               11%
110-115     2                       0.02                               2%
Total       100                     1.00                               100%
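
The relative frequencies of Example 2.4 can be checked with a few lines of Python. This is a minimal sketch; the function name `relative_frequencies` is an assumption.

```python
# Frequency distribution of wages from Example 2.4
wages = {
    "75-80": 9, "80-85": 12, "85-90": 15, "90-95": 11,
    "95-100": 20, "100-105": 20, "105-110": 11, "110-115": 2,
}


def relative_frequencies(fd):
    """Relative frequency of a class = class frequency / total frequency."""
    total = sum(fd.values())
    return {cls: f / total for cls, f in fd.items()}


rel = relative_frequencies(wages)
for cls, r in rel.items():
    print(f"{cls:<9}{wages[cls]:>4}{r:>7.2f}{r:>7.0%}")
```

The relative frequencies always sum to 1 (100%), whatever the class frequencies are.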
Cumulative Frequency Distribution

The cumulative frequency of a value of a variable (or of a class) is the sum of all the frequencies preceding or succeeding that value (class), including the frequency of that value (class) itself.
a) ‘Less than’ cumulative frequency distribution
The sum of all frequencies lying below the upper class boundary (UCB) of each class.
b) ‘More than’ cumulative frequency distribution
The sum of all frequencies lying above the lower class boundary (LCB) of each class.
• Example 2.5: The table below shows the ‘less than’ cumulative frequency distribution of the marks of 70 students in a class.
Solution

Marks   Frequency   ‘Less than’ cumulative frequency
30-35   5           5
35-40   10          5+10=15
40-45   15          15+15=30
45-50   30          30+30=60
50-55   5           60+5=65
55-60   5           65+5=70
• The above ‘less than’ cumulative frequency distribution can also be written as follows:

Marks          Cumulative frequency
Less than 30   0
Less than 35   5
Less than 40   15
Less than 45   30
Less than 50   60
Less than 55   65
Less than 60   70
• ‘More than’ cumulative frequency distribution of the marks of 70 students

Marks   Frequency   ‘More than’ cumulative frequency
30-35   5           65+5=70
35-40   10          55+10=65
40-45   15          40+15=55
45-50   30          10+30=40
50-55   5           5+5=10
55-60   5           5
• The above ‘more than’ cumulative frequency distribution can also be expressed in the following form:

Marks          Number of students
More than 30   70
More than 35   65
More than 40   55
More than 45   40
More than 50   10
More than 55   5
More than 60   0
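
Both cumulative distributions of Example 2.5 are running totals, so they can be sketched with `itertools.accumulate` from the Python standard library. The function names `less_than_cf` and `more_than_cf` are assumptions.

```python
from itertools import accumulate

# Marks of 70 students, from Example 2.5
classes = ["30-35", "35-40", "40-45", "45-50", "50-55", "55-60"]
freqs = [5, 10, 15, 30, 5, 5]


def less_than_cf(frequencies):
    """Running total starting from the first class."""
    return list(accumulate(frequencies))


def more_than_cf(frequencies):
    """Running total starting from the last class, reported in class order."""
    return list(accumulate(reversed(frequencies)))[::-1]


lcf = less_than_cf(freqs)
mcf = more_than_cf(freqs)
for cls, f, lo, hi in zip(classes, freqs, lcf, mcf):
    print(f"{cls:<7}{f:>4}{lo:>5}{hi:>5}")
```

The last ‘less than’ value and the first ‘more than’ value both equal the total frequency, which is a convenient check.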
DIAGRAMMATIC PRESENTATION OF DATA

BAR CHARTS
i. Simple bar chart
In a simple bar chart, each bar represents one and only one figure.
A simple bar chart is usually constructed to represent totals only.

Example 2.6: The following table shows the number of students attending four departments.

Department           Mathematics   Statistics   Physics   Chemistry
Number of students   56            45           40        50

Required: Construct a simple bar chart for the above table.
Solution:

[Figure: simple bar chart of the number of students (vertical axis, 0 to 60) by department (horizontal axis): Math. 56, Stat. 45, Physics 40, Chemistry 50]
ii. Component (sub-divided) bar chart
A component bar chart shows the breakdown of an aggregate into its constituent parts for a year, place or sector.
Example: The table and chart below show the revenue expenditure of a country on education.

Education expenditure (in millions)

                   1978-80   1980-81   1981-82
Primary            60        80        40
Secondary          40        60        60
Higher Education   20        40        20
Total              120       180       120

Required: Construct a component bar chart.
Solution

[Figure: component bar chart of education expenditure (vertical axis, 0 to 200) for 1978-80, 1980-81 and 1981-82, each bar subdivided into Primary, Secondary and Higher Education]
iii. Multiple Bar chart
• Here the interrelated component parts are shown in adjoining bars, colored or marked differently, thus allowing comparison between the different parts.
• Example: The chart below shows the revenue expenditure of a country on education.

[Figure: multiple bar chart of education expenditure (vertical axis, 0 to 90) for 1978-80, 1980-81 and 1981-82, with adjoining bars for Primary, Secondary and Higher Education]
iv. Pie chart
• A pie chart is a circle divided by radial lines into sectors.
• It displays quantities as percentages of a given total.
• The total area of the pie represents 100 percent.
• The angles at the centre sum to 360°.
• Degrees = Amount × 360° / Total
Example: The following table shows the monthly expenses of a family with an income of 1000 Birr.

Item               Food   Clothing   Rent   Others   Total
Amount (in Birr)   400    200        250    150      1000

Required: Construct the pie chart.
Solution
For example, the central angle for Food is 400/1000 × 360° = 144°.

Item       Amount (in Birr)   Degrees (size of central angle)   Amount (in percentages)
Food       400                144                               40
Clothing   200                72                                20
Rent       250                90                                25
Others     150                54                                15
Total      1000               360                               100
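
The central angles and percentages can be reproduced with the formula Degrees = Amount × 360 / Total. The following is a small sketch; the function name `pie_sectors` is an assumption.

```python
# Monthly expenses (in Birr) from the pie chart example
expenses = {"Food": 400, "Clothing": 200, "Rent": 250, "Others": 150}


def pie_sectors(amounts):
    """Return {item: (central angle in degrees, percent of total)}."""
    total = sum(amounts.values())
    return {
        item: (amount * 360 / total, amount * 100 / total)
        for item, amount in amounts.items()
    }


sectors = pie_sectors(expenses)
for item, (degrees, percent) in sectors.items():
    print(f"{item:<9}{expenses[item]:>5}{degrees:>7.0f}{percent:>6.0f}")
```

The angles must sum to 360 and the percentages to 100, so either column can be used to check the other.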
Solution…

Pie chart for the above table

[Figure: pie chart with sectors Food 40%, Clothing 20%, Rent 25%, Others 15%]
GRAPHICAL PRESENTATION

i. Histogram
A histogram is a graphical display of the distribution of a data set.
It looks like a vertical bar graph, except that the columns touch each other.
Class boundaries are marked on the x-axis and frequencies on the y-axis, according to a suitable scale.
Example 2.10: Construct a histogram for the following FD.

Class limits   fi   Class boundary
10-19          4    9.5-19.5
20-29          5    19.5-29.5
30-39          8    29.5-39.5
40-49          6    39.5-49.5
50-59          2    49.5-59.5
Solution

[Figure: histogram with class boundaries on the horizontal axis and frequency (0 to 10) on the vertical axis]
ii. Frequency polygon

A line chart of a frequency distribution in which either the values of a discrete variable or the class marks of the classes are plotted against the frequencies, and the plotted points are joined by straight lines.
For a large number of classes, a frequency polygon is preferable.

Example 2.11: Construct a frequency polygon for the above FD.

Class limits   Frequency   Class mark
10-19          4           14.5
20-29          5           24.5
30-39          8           34.5
40-49          6           44.5
50-59          2           54.5
Solution

[Figure: frequency polygon with class marks 4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5 on the horizontal axis and frequency (0 to 10) on the vertical axis]
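
The points of the frequency polygon in Example 2.11 can be computed as follows. This sketch assumes the polygon is closed on the horizontal axis by adding one zero-frequency class mark on each side (4.5 and 64.5), which matches the axis of the solution figure. The function name `polygon_points` is an assumption.

```python
# Class limits and frequencies from Example 2.11
limits = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
freqs = [4, 5, 8, 6, 2]


def polygon_points(class_limits, frequencies):
    """Return (class mark, frequency) points, padded with zero-frequency end points."""
    marks = [(lower + upper) / 2 for lower, upper in class_limits]
    step = marks[1] - marks[0]   # class width = gap between consecutive class marks
    xs = [marks[0] - step] + marks + [marks[-1] + step]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))


points = polygon_points(limits, freqs)
for x, y in points:
    print(x, y)
```

Joining these points in order with straight lines gives the polygon.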
Ogive
An ogive is the graphic representation of a cumulative frequency distribution.
Ogives are of two kinds.
i. ‘Less than’ ogive: the upper class boundaries are plotted against the ‘less than’ cumulative frequencies of the respective classes, and the points are joined by straight lines.
Plotting these frequencies gives a rising curve.
ii. ‘More than’ ogive: the lower class boundaries are plotted against the ‘more than’ cumulative frequencies of the respective classes, and the points are joined by straight lines.
Plotting these frequencies gives a declining curve.
Example: Using the following frequency distribution, construct:
(a) the ‘less than’ ogive and
(b) the ‘more than’ ogive.

Class limits   Frequency
10-19          4
20-29          5
30-39          8
40-49          6
50-59          2
To construct both the ‘less than’ and the ‘more than’ ogive, find the LCB and UCB of each class and the ‘less than’ (LCF) and ‘more than’ (MCF) cumulative frequencies.

Class limits   Frequency   LCF   MCF   LCB    UCB
10-19          4           4     25    9.5    19.5
20-29          5           9     21    19.5   29.5
30-39          8           17    16    29.5   39.5
40-49          6           23    8     39.5   49.5
50-59          2           25    2     49.5   59.5
• The ‘less than’ ogive

[Figure: ‘less than’ ogive, with upper class boundaries (9.5 to 59.5) on the horizontal axis and cumulative frequency (up to 25) on the vertical axis]
• The ‘more than’ ogive

[Figure: ‘more than’ ogive, with lower class boundaries (9.5 to 59.5) on the horizontal axis and cumulative frequency (up to 25) on the vertical axis]
The End
Thank You!