
Statistics CH-2


Debre Brehan University

College of Business & Economics


Department of Management

Statistics for Management-I

By Yilma E. (MBA)
Chapter Two: Data Collection and Presentation
Meaning of Data
Data are the facts and figures collected, analyzed, and
summarized for presentation and interpretation.

All the data collected in a particular study are referred to
as the data set for the study.
Sampling Techniques
Researchers use samples to collect data and
information about a particular variable from a large
population.

Using samples saves time and money and, in some cases,
enables the researcher to get more detailed information
about a particular subject.
There are four basic methods of
sampling:
1. Random Sampling
2. Systematic Sampling
3. Stratified Sampling
4. Cluster Sampling.
Random Sampling: Subjects are selected by using chance
methods or random numbers.
• One such method is to number each subject in the population.
• Example: Place the numbered cards in a bowl, mix them thoroughly, and select as many cards as needed.
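The card-and-bowl procedure can be sketched in a few lines of Python. This is only an illustration: the population size of 100, the sample size of 10, and the function name `simple_random_sample` are assumptions, not part of the chapter.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick `sample_size` distinct subject numbers from 1..population_size."""
    # A fixed seed only makes the draw repeatable; omit it for a fresh draw.
    rng = random.Random(seed)
    # range(1, N + 1) plays the role of the numbered cards in the bowl.
    return rng.sample(range(1, population_size + 1), sample_size)

# Hypothetical sizes: 100 numbered subjects, 10 cards drawn.
sample = simple_random_sample(100, 10, seed=1)
print(sorted(sample))
```

Because `sample` draws without replacement, no subject can be selected twice, just as a card cannot be drawn twice from the bowl.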
Systematic Sampling: Researchers obtain samples by
numbering each subject of the population and then selecting
every kth subject.

Example: Suppose there were 2000 subjects in the population and a sample of 50 subjects was needed. Since 2000 ÷ 50 = 40, k = 40, and every 40th subject would be selected, starting from a subject chosen at random among the first 40.
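A minimal sketch of the same arithmetic, assuming the subjects are numbered 1 to 2000. The random starting point among the first k subjects is a common convention; the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Select every k-th subject number, where k = population_size // sample_size."""
    k = population_size // sample_size           # 2000 // 50 = 40
    start = random.Random(seed).randint(1, k)    # random start among the first k subjects
    return [start + i * k for i in range(sample_size)]

chosen = systematic_sample(2000, 50, seed=7)
print(chosen[:3], "...", chosen[-1])
```

Every pair of neighbouring selections is exactly 40 subjects apart, and the last selection never exceeds 2000.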
Stratified Sampling: Researchers obtain samples by dividing
the population into groups (called strata) according to some
characteristic that is important to the study, then sampling from
each group.
Samples within the strata should be randomly selected.
Example: Students could be grouped into strata by year of study:
• First year students
• Second year students
• Third year students
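The idea can be sketched as follows. The student IDs, the stratum sizes, and the choice of five students per stratum are all made up for illustration, and `stratified_sample` is a hypothetical helper name.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of `per_stratum` members from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical student IDs, grouped by year of study.
strata = {
    "First year": [f"F{i}" for i in range(1, 31)],
    "Second year": [f"S{i}" for i in range(1, 26)],
    "Third year": [f"T{i}" for i in range(1, 21)],
}
picked = stratified_sample(strata, 5, seed=3)
for year, students in picked.items():
    print(year, students)
```

Taking the same number from each stratum is only one possible design; a sample proportional to stratum size is another.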
Cluster Sampling: Here the population is divided into
groups (called clusters) by some means such as
geographic area or schools in a large school district, etc.

Then the researcher randomly selects some of these clusters and uses all members of the selected clusters as the subjects of the sample.
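A rough sketch of cluster sampling, with invented school names and member IDs. Unlike stratified sampling, the random choice is made among whole clusters, and every member of a chosen cluster is kept. The name `cluster_sample` is hypothetical.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    subjects = [member for name in chosen for member in clusters[name]]
    return chosen, subjects

# Hypothetical clusters: schools in a district and their students.
clusters = {
    "School A": ["a1", "a2", "a3"],
    "School B": ["b1", "b2"],
    "School C": ["c1", "c2", "c3", "c4"],
    "School D": ["d1", "d2"],
}
chosen, subjects = cluster_sample(clusters, 2, seed=5)
print(chosen, subjects)
```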
Classification of Data
The word data means a set of known facts. There are
different types of data because there are different ways in
which facts are gathered.
Thus, data can be classified as either categorical
(qualitative) or numerical (quantitative).
• Data that can be grouped by specific categories are referred to as categorical data.
• Categorical data use either the nominal or ordinal scale of measurement.
Data that use numeric values to indicate how much or
how many are referred to as quantitative data.

Quantitative data are obtained using either the interval or
ratio scale of measurement.
Methods of Data Collection
Three of the most common methods of data collection are:
1. The Telephone Survey
2. The Mailed Questionnaire
3. The Personal Interview
1. Telephone Surveys
Advantages:
• Less costly (inexpensive).
• There is no face-to-face contact.
• People may have more freedom to express their opinions.
Disadvantages:
• Some of the people in the population will not have phones.
• Some people will not answer when the calls are made.
• The tone of voice of the interviewer might influence the response of the person.
2. Mailed Questionnaire Surveys
Advantages:
• Can be used to cover a wider geographic area.
• Less costly to conduct.
• Respondents can remain anonymous if they desire.
Disadvantages:
• It is more time-consuming.
• Low number of responses and inappropriate answers to questions.
• Some people may have difficulty reading or understanding the questions.
3. Personal Interview Surveys
Advantages:
• Obtains more in-depth responses to questions.
• Alleviates or eliminates respondents' difficulty in reading the questions.
Disadvantages:
• The interviewers must be trained in asking questions and recording responses.
• It is more costly than the other two survey methods.
• The interviewer may be biased in his or her selection of respondents.
Methods of Data Presentation
The methods of data presentation are categorized as:
1. Tabular Methods
2. Graphical Methods
1. Tabular Methods of Data Presentation

The most common tabular methods include:
A. Frequency Distributions
B. Relative Frequency Distributions
C. Percent Frequency Distributions
D. Cumulative Frequency Distributions
A. Frequency Distribution
It is a tabular summary of data showing the number (frequency)
of items in each of several non-overlapping classes.
• Example:
Student: 1 2 3 4 5 6 7 8 9 10
Grade:   B C B A D C A B D B
Using the given information, construct the frequency distribution.
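One way to check the tally for this example is a short Python sketch. The grades are the ten values listed above; using `collections.Counter` is simply a convenient choice, not a method prescribed by the chapter.

```python
from collections import Counter

# Grades of students 1 to 10, in the order given in the example.
grades = ["B", "C", "B", "A", "D", "C", "A", "B", "D", "B"]

frequency = Counter(grades)   # maps each grade (class) to its frequency
for grade in sorted(frequency):
    print(grade, frequency[grade])
print("Total", sum(frequency.values()))
```

The classes are non-overlapping because each student has exactly one grade, and the frequencies add up to the 10 students.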
The two types of frequency distributions are the
categorical frequency distribution and the
grouped frequency distribution.

A. Categorical Frequency Distributions: Used for data that can be placed in specific categories, such as nominal- or ordinal-level data.
Example
Given the following data set, construct a frequency
distribution for the blood type.
Solution
Step-1: Make a table as shown.
Step-2: Tally the data and place the results in column B.
Step-3: Count the tallies and place the results in column C.
Step-4: Find the percentage of values in each class by using the formula
% = (f ÷ n) × 100
• Where:
• f = frequency of the class and
• n = total number of values.
Step-5: Find the totals for columns C (frequency) and D (percent). The completed table is shown.
• For the sample, more people have type O blood (36%) than any other type.
B. Grouped Frequency Distributions: When the range of the
data is large, the data must be grouped into classes that are
more than one unit in width.
• Lower Class Limit: It represents the smallest data value that can be included in the class.
• Upper Class Limit: It represents the largest data value that can be included in the class.
• Class Boundaries: These numbers are used to separate the classes so that there are no gaps in the frequency distribution.
The basic rule of thumb is that:
• The class limits should have the same decimal place value as the data, but
• The class boundaries should have one additional place value and end in 5.

For whole-number data, find the boundaries by subtracting 0.5 from the lower class limit and adding 0.5 to the upper class limit.
• Lower Boundary = Lower Limit – 0.5
• Upper Boundary = Upper Limit + 0.5
• Note: If the data are in tenths, such as 6.2, 7.8, and 12.6, the class limits might be 7.8–8.8, and the class boundaries would be 7.75–8.85 (i.e., by subtracting 0.05 from 7.8 and adding 0.05 to 8.8).
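The rule amounts to moving each limit outward by half of the smallest unit in the data. A small sketch, assuming the caller states that unit explicitly ("1" for whole numbers, "0.1" for tenths); the function name `class_boundaries` and the `unit` parameter are illustrative, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

def class_boundaries(lower_limit, upper_limit, unit="1"):
    """Return (lower boundary, upper boundary) for one class.

    `unit` is the precision of the data: "1" for whole numbers, "0.1" for tenths.
    Limits are passed as strings so that Decimal keeps them exact.
    """
    half_unit = Decimal(unit) / 2               # 0.5 for whole numbers, 0.05 for tenths
    return (Decimal(lower_limit) - half_unit,
            Decimal(upper_limit) + half_unit)

print(class_boundaries("100", "104"))           # whole-number data
print(class_boundaries("7.8", "8.8", "0.1"))    # data recorded in tenths
```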

• The Class Width: Found by subtracting the lower (or upper) class limit of one class from the lower (or upper) class limit of the next class.
• For example, assume 24 and 31 are the lower class limits of the first and second classes, respectively; the class width is 7 (31 – 24 = 7).
The Class Midpoint (Xm) is obtained by adding the lower and upper boundaries and dividing by 2, or by adding the lower and upper limits and dividing by 2.
• Mathematically:
Xm = (Lower Boundary + Upper Boundary) ÷ 2
Or
Xm = (Lower Limit + Upper Limit) ÷ 2
Unique Properties of the Class

1. The classes must be mutually exclusive. Overlapping is not allowed.
2. The classes must be continuous. Even if there are no values in a class, the class must be included in the frequency distribution.
3. The classes must be exhaustive. There should be enough classes to accommodate all the data.
4. The classes must be equal in width. This avoids a distorted view of the data. One exception occurs when a distribution has a class that is open-ended.
Here are two examples of distributions with open-ended
classes.

Age is open-ended for the last class, while Minutes is
open-ended for the first class.
Example
These data represent the record high temperatures in °F
for each of the 50 states. Construct a grouped frequency
distribution for the data using 7 classes.
Solution
Step-1: Determine the range and class width.
• Find the range: Range = Highest Value – Lowest Value
Range = 134 – 100 = 34
• Find the class width: Width = Range ÷ Number of Classes = 34 ÷ 7 ≈ 4.9, which is rounded up to 5.
Step-2: Determine the starting point for the lowest class limit.
• This can be the smallest data value or any convenient number less than the smallest data value.
• Add the width to the starting point to get the lower limit of the next class (100, 105, 110, etc.).
Step-3: Determine the class limits and class boundaries.
• Find the class limits: each upper limit is one unit below the lower limit of the next class (100–104, 105–109, etc.).
• Find the class boundaries by subtracting 0.5 from each lower class limit and adding 0.5 to each upper class limit: 99.5–104.5, 104.5–109.5, etc.
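Steps 1 to 3 can be sketched in Python using only the numbers quoted in the solution (lowest value 100, highest value 134, 7 classes). The sketch assumes whole-number data; the function name `build_classes` and the dictionary layout are illustrative choices.

```python
def build_classes(lowest, highest, num_classes):
    """Return the class width and the limits/boundaries of each class (whole-number data)."""
    data_range = highest - lowest                 # 134 - 100 = 34
    # Always round up to the next whole number so the highest value fits: 34 / 7 -> 5.
    width = data_range // num_classes + 1
    classes = []
    for i in range(num_classes):
        lower = lowest + i * width                # 100, 105, 110, ...
        upper = lower + width - 1                 # 104, 109, 114, ...
        classes.append({"limits": (lower, upper),
                        "boundaries": (lower - 0.5, upper + 0.5)})
    return width, classes

width, classes = build_classes(100, 134, 7)
print("width =", width)
for c in classes:
    print(c["limits"], c["boundaries"])
```

Tallying the data (Steps 4 and 5) then only requires counting how many values fall between each pair of limits.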
Step-4: Tally the data.
Step-5: Find the numerical frequencies from the tallies.

The completed frequency distribution is


B. Relative Frequency Distribution
The relative frequency of a class equals the fraction or
proportion of items belonging to a class.

For a data set with n observations, the relative frequency of each class can be determined as follows:
Relative Frequency of a Class = Frequency of the Class ÷ n
C. Percent Frequency Distributions
The percent frequency of a class is the relative frequency
multiplied by 100.
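Both quantities follow directly from the class frequencies. As a hedged illustration, the sketch below reuses the tally of the ten student grades listed earlier in the chapter (A = 2, B = 4, C = 2, D = 2); the function name `relative_and_percent` is hypothetical.

```python
def relative_and_percent(frequencies):
    """Map each class to (relative frequency, percent frequency)."""
    n = sum(frequencies.values())            # total number of observations
    table = {}
    for cls, f in frequencies.items():
        relative = f / n                     # fraction of items in the class
        table[cls] = (relative, relative * 100)
    return table

# Tally of the ten grades from the earlier frequency distribution example.
grade_frequencies = {"A": 2, "B": 4, "C": 2, "D": 2}
table = relative_and_percent(grade_frequencies)
for cls, (relative, percent) in table.items():
    print(cls, relative, percent)
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on the arithmetic.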
Example: Given the following data, calculate the
relative frequency distribution and percent frequency distribution.
Solution
D. Cumulative Frequency Distribution
The Cumulative Distribution is a distribution that shows
the number of data values less than or equal to a specific
value (usually an upper boundary).
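A cumulative distribution is a running total of the class frequencies. In the sketch below the seven class frequencies are assumed for illustration: they total 50 and agree with the figures this chapter quotes for the record high temperature example (18 and 13 values in the two largest classes, 28 values below 114.5), but the full table is not reproduced in the text.

```python
from itertools import accumulate

# Upper class boundaries of the seven temperature classes.
upper_boundaries = [104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
# Assumed class frequencies (illustrative, see the note above).
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Running total: number of values less than each upper boundary.
cumulative = list(accumulate(frequencies))
for boundary, total in zip(upper_boundaries, cumulative):
    print(f"Less than {boundary}: {total}")
```

The last cumulative frequency must equal the total number of observations.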
Example: Given the following data, calculate the
cumulative frequency distribution.
Solution
Activity
Listed are the weights of the top 50 players. Construct a
grouped frequency distribution and a cumulative frequency
distribution with 8 classes.
2. Graphical Methods of Data Presentation
Graphic methods are another way to present data. They include:
• The Histogram
• The Frequency Polygon
• The Ogive
• The Pie Chart
• The Bar Chart
The Histogram
A histogram is constructed by placing:
• The variable of interest on the horizontal axis and
• The frequency, relative frequency, or percent frequency on the vertical axis.
The frequency, relative frequency, or percent frequency of each class is shown by drawing a rectangle:
• Whose base is determined by the class limits on the horizontal axis and
• Whose height is the corresponding frequency, relative frequency, or percent frequency.
Example
Construct a histogram to represent the data shown for the record high temperatures for each of the 50 states.
Solution
Step-1: Draw and label the x and y axes.
Step-2: Represent the frequency on the y axis and the class boundaries on the x axis.
Step-3: Using the frequencies as the heights, draw vertical bars for each class.
• As the histogram shows, the class with the greatest number of data values (18) is 109.5–114.5, followed by 13 for 114.5–119.5.
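Without a plotting library, the same picture can be approximated as a sideways text histogram, one row per class. Only the frequencies 18 and 13 are stated in the text; the other five are assumed for illustration so that the total is 50. The helper name `text_histogram` is hypothetical.

```python
def text_histogram(boundaries, frequencies, symbol="#"):
    """Return one text row per class; the bar length equals the class frequency."""
    rows = []
    for (low, high), f in zip(boundaries, frequencies):
        rows.append(f"{low:>5}-{high:<5} | {symbol * f} ({f})")
    return rows

# Class boundaries 99.5-104.5, 104.5-109.5, ..., 129.5-134.5.
boundaries = [(99.5 + 5 * i, 104.5 + 5 * i) for i in range(7)]
# Assumed frequencies; only 18 and 13 are quoted in the chapter.
frequencies = [2, 8, 18, 13, 7, 1, 1]

rows = text_histogram(boundaries, frequencies)
print("\n".join(rows))
```

In a drawn histogram the bars are vertical and touch at the class boundaries; the text version keeps only the relative heights.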
The Frequency Polygon
It is a graph that displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes.
The frequencies are represented by the heights of the points.
Example
Using the following frequency distribution,
construct a frequency polygon.
Solution
Step-1: Find the midpoints of each class.
The midpoints are found by adding the upper and lower boundaries and dividing by 2.
For instance, for a class with boundaries 99.5–104.5: (99.5 + 104.5) ÷ 2 = 102
The midpoints for each class are:
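A minimal sketch of Step-1, assuming the distribution is the seven-class temperature example with boundaries 99.5–104.5 through 129.5–134.5 (the table itself is not reproduced in the text). The function name `class_midpoints` is illustrative.

```python
def class_midpoints(boundaries):
    """Midpoint of each class: (lower boundary + upper boundary) / 2."""
    return [(low + high) / 2 for low, high in boundaries]

# Assumed classes: 99.5-104.5, 104.5-109.5, ..., 129.5-134.5.
boundaries = [(99.5 + 5 * i, 104.5 + 5 * i) for i in range(7)]
midpoints = class_midpoints(boundaries)
print(midpoints)
```

Pairing each midpoint with its class frequency gives the (x, y) points that the remaining steps plot and connect.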
Step-2: Draw the x and y axes. Label the x axis with the midpoint of each class, and then use a suitable scale on the y axis for the frequencies.
Step-3: Using the midpoints for the x values and the frequencies as the y values, plot the points.
Step-4: Connect adjacent points with line segments.


The Ogive
A graph of a cumulative distribution is called an Ogive, or cumulative frequency graph.
It shows data values on the horizontal axis and either the cumulative frequencies, the cumulative relative frequencies, or the cumulative percent frequencies on the vertical axis.
The cumulative frequency is the sum of the frequencies accumulated up to the upper boundary of a class in the distribution.
The Ogive is constructed by plotting a point corresponding to the cumulative frequency of each class.
Example
Using the following frequency distribution, construct an Ogive.
Solution
Step-1: Find the cumulative frequency for each class.
Step-2: Draw the x and y axes. Label the x axis with the class boundaries and represent the cumulative frequencies on the y axis.
Step-3: Plot the cumulative frequency at each upper class boundary.
Step-4: Starting with the first upper class boundary, 104.5, connect adjacent points with line segments. Then extend the graph to the first lower class boundary, 99.5, on the x axis.
Cumulative frequency graphs are used to visually represent how many values are below a certain upper class boundary.
For example, to find out how many record high temperatures are less than 114.5°F, locate 114.5°F on the x axis, draw a vertical line up until it intersects the graph, and then draw a horizontal line at that point to the y axis.
The y axis value is 28. Thus, there are 28 records that are below 114.5°F.
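Reading a value off the Ogive can be imitated numerically. The sketch below assumes cumulative frequencies of 0, 2, 10, 28, 41, 48, 49, 50 at the boundaries 99.5 to 134.5; only the value 28 at 114.5 is stated in the text, the rest are illustrative. Between boundaries it follows the straight line segments of the graph, which assumes values are spread evenly inside a class. The name `ogive_value` is hypothetical.

```python
from bisect import bisect_right

def ogive_value(x, boundaries, cumulative):
    """Cumulative frequency read from the Ogive at data value x.

    boundaries: first lower boundary followed by every upper boundary.
    cumulative: cumulative frequency at each of those boundaries (starts at 0).
    """
    if x <= boundaries[0]:
        return 0.0
    if x >= boundaries[-1]:
        return float(cumulative[-1])
    i = bisect_right(boundaries, x) - 1          # segment that contains x
    x0, x1 = boundaries[i], boundaries[i + 1]
    y0, y1 = cumulative[i], cumulative[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)  # point on the line segment

boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
cumulative = [0, 2, 10, 28, 41, 48, 49, 50]      # assumed, except 28 at 114.5
below_114_5 = ogive_value(114.5, boundaries, cumulative)
print(below_114_5)
```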
Bar Charts
A bar chart is used for depicting categorical data summarized in a frequency, relative frequency, or percent frequency distribution.
The graph represents the classes (categories) on the horizontal axis (x-axis).
A frequency, relative frequency, or percent frequency scale is represented on the vertical axis (y-axis).
Example
Using the following information, construct the bar chart.
Solution
Step-1: Calculate the relative frequency and percent frequency distributions.
Step-2: Draw and label the x and y axes.
Step-3: Represent the classes (categories) on the x axis and a frequency, relative frequency, or percent frequency scale on the y axis.
Step-4: Using the frequencies as the heights, draw vertical bars for each class.
Pie Charts
The pie chart is a graphical device that presents relative
frequency and percent frequency distributions for categorical
data.
To construct a pie chart, first we draw a circle to represent all
the data.
Then we use the relative frequencies to subdivide the circle
into sectors, or parts, which correspond to the relative
frequency for each class.
• For example, because a circle contains 360 degrees and Coca Cola shows a relative frequency of 0.38, the sector of the pie chart labeled Coca Cola consists of 0.38(360) = 136.8 degrees. The sector of the pie chart labeled Mirinda consists of 0.16(360) = 57.6 degrees, and so on.
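The sector calculation is a single multiplication per class. The sketch below uses only the two relative frequencies quoted above (the remaining soft drink classes are not listed in the text); the function name `sector_angles` is illustrative.

```python
def sector_angles(relative_frequencies):
    """Convert relative frequencies to pie chart sector angles in degrees."""
    # A full circle is 360 degrees; rounding hides floating-point noise.
    return {label: round(rf * 360, 1)
            for label, rf in relative_frequencies.items()}

# Only these two classes are quoted in the text.
angles = sector_angles({"Coca Cola": 0.38, "Mirinda": 0.16})
print(angles)
```

When every class is included, the relative frequencies sum to 1, so the sector angles sum to 360 degrees.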

The numerical values shown for each sector can be frequencies, relative frequencies, or percent frequencies.
Figure: Pie Chart of Soft Drink Purchases
Class Activity
1. Using the given information, construct a histogram, frequency polygon, and Ogive using relative frequencies for the distribution (shown here) of the miles that 20 randomly selected runners ran during a given week.
2. For 108 randomly selected college applicants, the following frequency distribution for entrance exam scores was obtained. Construct a histogram, frequency polygon, and Ogive for the data.
End of the Chapter

Thank You for Your Attention
