
RM - Cha 8

Uploaded by

kide93920

UNITY UNIVERSITY

FACULTY: COLLEGE OF BUSINESS, ECONOMICS AND SOCIAL SCIENCE
COURSE TITLE: Research Methodology for Economists
COURSE CODE: Econ 364
DEPARTMENT: Economics
LEVEL: Undergraduate
CREDIT HOURS: 3
ACADEMIC YEAR: 2023/2024 G.C
SEMESTER: I
COURSE INSTRUCTOR: H.A
E-mail:
SKYPE:
Cellphone:
Unity University, Dec 2023
CHAPTER EIGHT: Sampling Design and Procedure
SAMPLING FUNDAMENTALS

• A statistical investigation can take two forms:

1. Census survey

2. Sample survey

1. Census survey:

The researcher:

• Studies every unit in the field of study/survey

• Derives conclusions by aggregating the data from all units

2. Sample survey:

The researcher:

• Studies only some of the units in the field of survey

• Makes generalizations or draws inferences, based on the sample, about the parameters of the population from which the sample is taken

• The sample should be truly representative of the population's characteristics, without any bias

• Representative: the sample should be a mirror reflection of the target population.
Basic Definitions Concerning Sampling
Population/Universe (N):

• The theoretically specified aggregation of survey elements from which the survey sample is actually selected.

• Can be:

I. Finite

II. Infinite

Finite Population:
• A population is called finite if it is possible to count or label its individuals

• It may also be called a countable population

Ex:

➢ The number of people passing Megenagna bridge today

➢ The number of births per month in Addis Ababa

➢ The number of telephone calls you made with colleagues during this week

Infinite Population:

• A population is called infinite if it is not possible to count the units it contains

• It may also be called an uncountable population

Ex:

➢ The number of minerals found somewhere else

➢ The number of germs in the body of a sick patient

Sampling Frame:

• The list of elements from which the sample is drawn

Sample:

• A subset, or some part, of a larger population

Sample size (n):

• The number of units in the sample

Sample Design:

• A definite plan for obtaining a sample from the sampling frame

Sampling:

• The process of using a small number of units, or part of a larger population, to draw conclusions about the whole population.

Element/Sampling Unit:

• The basic unit from which information is collected and which provides the basis of analysis

• Each and every member of the sample

Ex:

➢ The head of the household is the sampling unit for a household (HH) survey

➢ In a study of the average age of the Unity University 4th-year economics class, the student is the sampling unit
Statistic:

• A characteristic of a sample

Parameter:

• A characteristic of a population

Ex:

• Measures such as the mean, median, mode and the like, when worked out from samples, are called statistic(s), for they describe the characteristics of a sample

• But when the mean, median, mode, etc. describe the characteristics of a population, they are known as parameter(s)

• The population mean (μ) is a parameter

• Whereas the sample mean X̄ (x bar) is a statistic
Sampling error:

• Errors/inaccuracies that arise because only a sample is studied, so that the estimates vary around the true population values

Non-sampling errors:

• Errors/inaccuracies that arise from various causes at any stage, from the beginning when the survey is planned and designed to the final stage where the data are processed and analyzed
Precision:

• The range within which the population average (or other parameter) will lie, in accordance with the reliability specified in the confidence level

Ex:

• If the estimate is ETB 4,000 and the precision desired is ± 4% (4% of 4,000 = ETB 160)

➢ then the true value will be no less than ETB 3,840 and no more than ETB 4,160.

➢ this is the range (ETB 3,840 to ETB 4,160) within which the true answer should lie.
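
The precision range in the example above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `precision_range` is a hypothetical helper introduced here for illustration, not something defined in the course material.

```python
def precision_range(estimate, precision_pct):
    """Return the (lower, upper) limits for an estimate with +/- precision_pct percent."""
    margin = estimate * precision_pct / 100  # e.g. 4% of 4000 = 160
    return estimate - margin, estimate + margin


lower, upper = precision_range(4000, 4)
print(lower, upper)  # 3840.0 4160.0
```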


Confidence level:

• The confidence level, or reliability, is the expected percentage of times that the actual value will fall within the stated precision limits.

• If we take a confidence level of 95%, we mean that there are 95 chances in 100 (or .95 in 1) that the sample results represent the true condition of the population within a specified precision range, against 5 chances in 100 (or .05 in 1) that they do not.

Significance level:

• Indicates the likelihood that the answer will fall outside that range.

• If the confidence level is 95%, then the significance level will be (100 – 95), i.e., 5%

• If the confidence level is 99%, the significance level is (100 – 99), i.e., 1%, and so on.
Sampling distribution:

• The probability distribution of a statistic that comes from choosing random samples from a given population.
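
To make the idea of a sampling distribution concrete, the sketch below repeatedly draws random samples from a small made-up population (the numbers 1 to 100) and collects the sample means. The population, the sample size of 10 and the 2000 repetitions are all illustrative assumptions, and `sample_means` is a hypothetical helper name.

```python
import random
import statistics


def sample_means(population, sample_size, repetitions, seed=0):
    """Draw `repetitions` simple random samples and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repetitions)
    ]


population = list(range(1, 101))  # assumed toy population, N = 100
means = sample_means(population, sample_size=10, repetitions=2000)

print("population mean:", statistics.mean(population))
print("mean of sample means:", round(statistics.mean(means), 2))
print("spread of sample means:", round(statistics.stdev(means), 2))
```

The list `means` is an empirical picture of the sampling distribution of the mean: its values cluster around the population mean and are less spread out than the population itself.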
SAMPLING DESIGN PROCESS
• Sampling design is the set of steps involved in sample planning:
I. Define the population
II. Census vs sample
III. Determine the sampling procedure
   Probability sampling: 1. Simple random 2. Systematic 3. Stratified 4. Cluster 5. Multi-stage
   Non-probability sampling: 1. Quota 2. Snowball 3. Convenience 4. Judgmental
IV. Determine the appropriate sample size
V. Estimate the cost of sampling
VI. Execute the sampling design


I. Defining the population

• Defining the target population implies specifying the

subject of the study.


II. Census Vs Sample

• Choice must be made between census and sample


Census: Complete enumeration
Advantages of census

Reliability:

• Data derived through census are highly reliable.

• The only possible errors can be due to computation

Detailed information:

• Census data yield much more information.


Limitation of census

• Expensiveness:

• Excessive time and energy:


Sample: Partial enumeration

Need for sampling

• For estimating, testing and making inferences about a population

• Saves time and money.

• Enables researchers to carry out a detailed study


Limitations of sampling technique

• Less accuracy:

• Misleading conclusion:

• Need for specialized knowledge:


❑Sampling technique is used under the following

conditions.

• Vast data:

• When utmost accuracy is not required:

• Infinite population:

• When census is impossible:

• Homogeneity:
Essentials of an ideal sample

• 4 basic characteristics.

1) Representativeness:

2) Independence: Each unit should be free to be included

3) Adequacy: Sufficient units

4) Homogeneity:
III. Sample design

• Sample design is the heart of sample planning.

• Sample design should answer the following:

1. What type of sample to use?

2. What is the appropriate sample unit?
Sampling unit can be

a) Primary sampling unit:

• Units selected in the first stage of sampling

b) Secondary sampling unit:

• A unit selected in the second stage of sampling


IV. Sample size determination

• Sample size determination is a purely statistical activity, which needs statistical knowledge.

• The size of the sample should be determined by the researcher
Methods to Determine Sample Size:
1. Personal judgment: a subjective decision
2. Nature of the universe: homogeneous or heterogeneous
3. Budgetary approach:
Ex:
• Suppose the cost of surveying one individual or unit is 30 birr and the total fund available for the survey is 1800 birr.
• Determine the sample size.
Answer:

• Sample size (n) = total budget of survey / cost of surveying one unit

• Accordingly: 1800 / 30 = 60 units

• The sample size will be 60 units
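
The budgetary approach is a single division, sketched below. The helper name `budget_sample_size` is hypothetical; the sketch rounds down because a partially funded interview cannot be carried out, which is an assumption the slides do not state.

```python
def budget_sample_size(total_budget, cost_per_unit):
    """Largest number of units that can be surveyed within the budget."""
    if cost_per_unit <= 0:
        raise ValueError("cost_per_unit must be positive")
    return int(total_budget // cost_per_unit)  # round down to whole units


print(budget_sample_size(1800, 30))  # 60
```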


4. Traditional inferences:

• This is based on precision rate and confidence level.

• Information required:

a. The population size, if known

b. Variance of the population

c. The magnitude of acceptable error and

d. The confidence level


a. Population size:
• How many people are you talking about in total?
• To find this out, you need to be clear about who does and doesn't fit into your study
• Ex: If you want to know the number of economists in Ethiopia, you will include everyone who has economics as their major profession in Ethiopia
• If you are unable to calculate the exact number, it's common to work with an unknown number or an estimated range
b. Variance or heterogeneity of the population:
• It is measured by the standard deviation of the population.

• How similar or different are the members of the population?

• More variability requires a larger sample

• Less variability requires a smaller sample

• If you are not sure, you can start with 50% variability (p = 0.5 when estimating a proportion)

• According to the rule of thumb, the standard deviation is one-sixth of the range
Ex:
• Suppose the HHs' yearly average income is expected to range between 1,500 birr and 24,000 birr.
• Hence range (highest – lowest) = 24,000 – 1,500 = 22,500
• Using the rule of thumb, the standard deviation will be 1/6 × 22,500 = 3,750
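
The one-sixth rule of thumb can be written as a tiny helper. This is a sketch only; `sd_from_range` is a hypothetical name, and the rule itself is a rough approximation rather than an exact result.

```python
def sd_from_range(lowest, highest):
    """Rule-of-thumb standard deviation: one-sixth of the range."""
    return (highest - lowest) / 6


print(sd_from_range(1500, 24000))  # 3750.0
```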
c. Magnitude of acceptable error:

• The error that is acceptable for the study.

• The researcher makes a subjective judgment.

• Ex: To estimate the average income of households, one may allow an error of, say, ± 5%

d. Confidence level:

• In most research a 95% confidence level is used.

• That is, it is assumed that 95 times out of 100 the estimate from the sample will include the population parameter.
• Sample size is determined based on:
a. The mean:
$n = \left(\dfrac{Z S}{E}\right)^{2}$ … infinite population
$n = \dfrac{Z^{2} N S^{2}}{(N-1)E^{2} + Z^{2} S^{2}}$ … finite population
b. The proportion:
$n = \dfrac{Z^{2} p q}{e^{2}}$ … infinite population
$n = \dfrac{Z^{2} p q N}{(N-1)e^{2} + Z^{2} p q}$ … finite population

Where
• n represents the sample size
• N represents the population size
• Z represents the standardized value corresponding to the confidence level
• E (or e) represents the acceptable magnitude of error (the error factor)
• S represents the sample SD or an estimate of the population SD
• p and q (q = 1 - p) are proportions/percentages
Z-scores for the most common confidence levels

• The z-scores for the most common confidence levels are given below.

a. 90% – Z Score = 1.645

b. 95% – Z Score = 1.96

c. 99% – Z Score = 2.576
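
The tabulated z-scores can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) and assumes a two-sided interval; `z_score` is a hypothetical helper name.

```python
from statistics import NormalDist


def z_score(confidence_level):
    """Two-sided critical value of the standard normal for a confidence level such as 0.95."""
    alpha = 1 - confidence_level              # significance level
    return NormalDist().inv_cdf(1 - alpha / 2)  # split alpha across both tails


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: Z = {z_score(level):.3f}")
```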


a) Sample size when estimating a mean:
$n = \left(\dfrac{Z S}{E}\right)^{2}$ … infinite population

$n = \dfrac{Z^{2} N S^{2}}{(N-1)E^{2} + Z^{2} S^{2}}$ … finite population
Where,
N = size of population,
n = size of sample,
E = acceptable error (the precision),
S = standard deviation of population,
Z = standard variate at a given confidence level
Ex 1:
• Sample size calculation using the infinite-population formula for a mean:
• The HH yearly income is expected to range from 22,500 birr to 23,700 birr.
• Suppose we want to study the HH monthly expenditure on food
• We wish to have a 95% confidence level
• The acceptable error is no more than 20 birr
• And the estimated value of the SD (σ) is 200 (by the one-sixth rule of thumb: (23,700 – 22,500)/6 = 200)
Given:
• Z = 1.96
• E = 20
• S = 200
Solution:
$n = (ZS/E)^{2} = (1.96 \times 200/20)^{2} = 153{,}664/400 = 384.16 \approx 385$ (rounded up)
• If the acceptable error (E) is reduced to 10

• And all else remains the same

• The sample size will increase a lot

$n = (ZS/E)^{2} = (1.96 \times 200/10)^{2} = 153{,}664/100 = 1536.64 \approx 1537$ (rounded up)
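
Both calculations above, and the finite-population variant of the formula, can be sketched in Python as follows. The function name `sample_size_mean` is hypothetical, and the population size of 10,000 used at the end is an assumed figure chosen only to show the finite-population correction at work.

```python
import math


def sample_size_mean(z, s, e, population_size=None):
    """Sample size for estimating a mean.

    z: z-score for the confidence level, s: standard deviation,
    e: acceptable error, population_size: N for a finite population (optional).
    The result is rounded up to the next whole unit.
    """
    if population_size is None:
        # Infinite population: n = (Z*S/E)^2
        return math.ceil((z * s / e) ** 2)
    n_units = population_size
    # Finite population: n = Z^2*N*S^2 / ((N-1)*E^2 + Z^2*S^2)
    numerator = z**2 * n_units * s**2
    denominator = (n_units - 1) * e**2 + z**2 * s**2
    return math.ceil(numerator / denominator)


print(sample_size_mean(1.96, 200, 20))  # 385
print(sample_size_mean(1.96, 200, 10))  # 1537
# Assumed N = 10,000, for illustration only
print(sample_size_mean(1.96, 200, 20, population_size=10_000))
```

Halving the acceptable error roughly quadruples the required sample, because E enters the formula squared.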
Note that:

Since $n = (ZS/E)^{2}$

• As the standard deviation increases, the required sample size increases, and the reverse

• As the sample size increases, the margin of error (E) decreases

• As E decreases, the required sample size increases

$\mu = \bar{x} \pm 1.96\,\sigma/\sqrt{n}$
(b) Sample size when estimating a percentage or proportion:

$n = \dfrac{Z^{2} p q}{e^{2}}$ … infinite population

$n = \dfrac{Z^{2} p q N}{(N-1)e^{2} + Z^{2} p q}$ … finite population


Ex 1:

❑ Sample size calculation using a population proportion:

• Assume two candidates (A and B) are running for election

• To estimate the next president's final approval rating, how many people should be sampled, given that the proportion for candidate A is 25%, the margin of error is 0.025 and the confidence level is 95%?

Solution:
Polls: Candidate A: 25% = 0.25
Candidate B: 75% = 0.75
• 95% confidence level (Z = 1.96)
• Proportion p = 0.25, so q = 1 - p = 0.75
• Margin of error e = 0.025
• Then the sample size for the population proportion is
$n = Z^{2} p q/e^{2} = 1.96^{2} \times 0.25 \times (1 - 0.25)/0.025^{2} = 3.8416 \times 0.25 \times 0.75/0.000625 = 0.7203/0.000625 = 1152.48 \approx 1153$ (rounded up)
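
The proportion formulas can be sketched in the same way. Again, `sample_size_proportion` is a hypothetical helper name, and the optional `population_size` argument applies the finite-population formula given earlier in these notes.

```python
import math


def sample_size_proportion(z, p, e, population_size=None):
    """Sample size for estimating a proportion, rounded up to a whole unit.

    z: z-score, p: expected proportion, e: margin of error,
    population_size: N for a finite population (optional).
    """
    q = 1 - p
    zpq = z**2 * p * q
    if population_size is None:
        # Infinite population: n = Z^2*p*q / e^2
        return math.ceil(zpq / e**2)
    # Finite population: n = Z^2*p*q*N / ((N-1)*e^2 + Z^2*p*q)
    return math.ceil(zpq * population_size / ((population_size - 1) * e**2 + zpq))


print(sample_size_proportion(1.96, 0.25, 0.025))  # 1153
```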
V. Cost of Sampling
• The sample plan must take into account the
estimated cost of sampling.
• Such costs are of two types:
1. Overhead costs and
2. Variable costs
• In reality, however, it may be difficult, and for some people not even reasonable, to separate the sampling cost from the overall study cost.
VI. Execution of sampling process

• The last step in sample planning is the execution of the sampling process (procedure).

• In short, the sample is actually chosen.

• The actual requirements for the sampling procedure are:

• The sample must be representative:
➢ When it is representative, a sample is a relatively small piece of the population that mirrors the various patterns and subclasses of the population.
• The sample must be adequate:
➢ A sample is adequate when it is of sufficient size to provide confidence in the stability of its characteristics.
Sampling Techniques
• Two types:
1. Non-probability sampling
2. Probability sampling

1. Non-Probability Sampling
• Samples are selected based on the subjective judgment of the researcher, rather than by random selection

• It does not give every unit an equal chance of selection.

• Units are selected at the discretion of the researcher.

• Used when the representativeness of the sample is not the prime issue.
Advantages of non-probability sampling

• Less complicated

• Less expensive

• Very convenient in situations when:

✓ the sample to be selected is very small and

✓ the researcher wants to get some idea of the population characteristics
Disadvantages of non-probability sampling:
• No confidence can be placed in the data
➢ The sample does not represent the larger population
• It depends exclusively on uncontrolled factors and the researcher's insight

• There is no statistical method to determine the margin of sampling error.

• Based on an obsolete frame, which does not adequately cover the population
There are a number of non-probability sampling techniques:
1. Convenience/Accidental Sampling

• This is a "hit or miss" procedure of study.

• Attempts to obtain a sample of convenient elements

• Respondents are selected because they happen to be in the right place at the right time

• Used when respondents are easily available (close to you)

➢ No planned effort is made to collect information.


Ex:

✓People in my class

✓People with some characteristics (Ex: bald)

• The availability and willingness to respond are the

major factors in selecting the respondents.

• Such a sample is taken to test ideas or even to gain

ideas about a subject of interest.


• This type of study is

➢ Good for exploratory studies

➢ Not good for conclusive (descriptive and causal) studies

Ex:

• Suppose you are researching part-time instructors' perceptions of Unity University.

• Then you will select me as one of your respondents.
2. Judgment/Purposive/Deliberate sampling
• Is a form of convenience sampling

• The population elements are selected based on the judgment of the researcher.

• The researcher's knowledge is instrumental in creating a sample in this sampling technique

• Thus, there is a chance that the results obtained will be highly accurate, with a minimum margin of error.

• The researcher selects elements that are believed to be typical or representative of the population.

• The researcher selects a sample to serve a specific purpose.

• The sample is less than fully representative.


Ex:

❖ Suppose I want to conduct research on UU and select staff (full-time vs part-time) who can give the right feedback about the university.

❖ Which sampling technique is used to select such staff: convenience sampling or purposive sampling? Ans: Purposive sampling

❖ Selecting students for a competition, such as a quiz, to represent a school or a class

❖ The CPI is based on judgment sampling.

➢ It is based on the prices of a basket of goods and services (G+S) purchased by average HHs.
• Its advantages are:
✓ low cost,
✓ convenient to use,
✓ less time-consuming, and
✓ can be as good as probability sampling
• Weaknesses of this approach:
✓ Highly subjective
✓ Generalization is not appropriate
✓ Certain members of the population will have a smaller chance, or no chance, of selection compared to others
✓ Not representative, since favoritism is involved
✓ Its value depends entirely on the judgment of the researcher
3. Snowball/Network Sampling
• Also called multiplicity sampling/multi-stage sampling

• Snowball: begins small but becomes bigger and bigger as it rolls downhill
• Popular among scholars conducting:

➢ observational research and

➢ community studies, especially studies where subjects are hard to locate.
• First:
➢ An initial group of respondents is selected randomly
➢ Those respondents are then asked to provide the names of additional respondents who belong to the target population of interest
• This type of sampling technique works like a chain referral

• Therefore, it is also called chain referral sampling

Ex:
• If the researcher wants to study
➢ sex workers or
➢ drug abusers
❖ snowball sampling uses the assistance of the study subjects to identify the initial potential respondents and, through them, others.
Advantage of Snowball Sampling:
• It substantially increases the probability of finding the desired characteristic in the population, and lowers sampling variance and cost.

• Appropriate for small, specialized populations.

• Useful in studies involving respondents who are rare to find.
Limitations of Snowball Sampling:
• It takes more time

• Most likely not representative

• Members of the population:

✓ who are little known,

✓ disliked or

✓ whose opinions conflict with those of the respondents,

have a low probability of being included



4. Quota sampling

• The population is divided into different groups

• The interviewer assigns quotas to each group

• The selection of individuals from each stratum/group is based on the judgment of the interviewer

• May be a two-stage sampling

First stage:

• Consists of developing control categories/quotas of population elements

Second stage:

• Sample elements are selected based on convenience or judgment
Ex: Assume the total sample to be taken from a given population is 1000. The population may be: M = 60%, F = 40%; and Christian 50%, Muslim 45%, Judaism 3% and others 2%.

Category       Population   Sample
Gender
  M            60%          600
  F            40%          400
Religion
  Christian    50%          500
  Muslim       45%          450
  Judaism      3%           30
  Others       2%           20
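
The quotas in the table follow directly from the population percentages. The sketch below reproduces them; `quota_allocation` is a hypothetical helper, and it uses whole-number percentages with integer division, so shares that do not divide evenly would be rounded down.

```python
def quota_allocation(total_sample, shares_pct):
    """Quota per group, given each group's population share in whole percent."""
    return {group: total_sample * pct // 100 for group, pct in shares_pct.items()}


gender_quota = quota_allocation(1000, {"M": 60, "F": 40})
religion_quota = quota_allocation(
    1000, {"Christian": 50, "Muslim": 45, "Judaism": 3, "Others": 2}
)

print(gender_quota)    # {'M': 600, 'F': 400}
print(religion_quota)  # {'Christian': 500, 'Muslim': 450, 'Judaism': 30, 'Others': 20}
```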
Merits of quota sampling:

• The selection of the sample is quick, easy and cheap

• May control sample characteristics

• Better chance of a representative sample

Limitations of quota sampling:

• Selection bias

• The sample is not truly representative, and statistical properties cannot be applied


2. Probability Sampling
• All probability samples are based on chance selection procedures.
• Chance selection eliminates the bias inherent in the non-probability sampling procedure.
• Why does it eliminate the bias?
➢ Because the process is random.

• The most preferred type of sampling

• Why is it the most preferred type?
✓ The sample units are not selected at the discretion of the researcher
✓ Each unit of the population has some known probability of entering the sample
✓ The process of sampling is automatic in one or more steps of the selection of units in the sample
• There are 5 basic types of probability sampling:
1. Simple random sampling
2. Systematic sampling
3. Stratified sampling
4. Cluster sampling
5. Multi-stage sampling
1. Simple Random Sampling - SRS

• The basic sampling method in every statistical computation.

• Each element in the population has an equal chance of being included in the sample.

• The most straightforward of all methods

• The sampling process is simple because it requires only one stage of sample selection.
How to Perform Simple Random Sampling?

Step 1: Define the population

Step 2: Decide on the sample size

Step 3: Randomly select your sample

― done in one of the two ways

I. Lottery method

II. Table of random numbers

Step 4: Collect data from your sample


I. Lottery method
➢ Choose the sample by drawing from a hat or by using a computer program
➢ Suppose that we have to select a random sample of size n from a finite population of size N
S1: Assign the numbers 1 to N to the N units of the population
S2: Write the numbers 1 to N on different identical slips or cards, so that one card is not distinguishable from another
S3: Fold the slips and mix them up in a drum, a box or a container
S4: Select, blindfolded, the number of slips required for the desired sample size
➢ The selection of items thus depends on chance
➢ It can be made with replacement or without replacement

a. Simple random sampling without replacement (SRSWOR):

• The selected cards are not replaced before the next draw

b. Simple random sampling with replacement (SRSWR):

• The selected cards are replaced before the next draw

• If the population size is large, the lottery method is cumbersome

• In that case, use a table of random numbers
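
The lottery method, in both its without-replacement and with-replacement forms, can be imitated with Python's standard `random` module. This is only a sketch of the "computer program" route mentioned above; the helper names are hypothetical, and the population of 100 units and sample of 10 are assumed figures.

```python
import random


def srs_without_replacement(population_size, sample_size, seed=None):
    """SRSWOR: each unit number (1..N) can be drawn at most once."""
    rng = random.Random(seed)
    return rng.sample(range(1, population_size + 1), sample_size)


def srs_with_replacement(population_size, sample_size, seed=None):
    """SRSWR: a drawn slip goes back in the drum, so numbers may repeat."""
    rng = random.Random(seed)
    return rng.choices(range(1, population_size + 1), k=sample_size)


wor = srs_without_replacement(100, 10, seed=42)
wr = srs_with_replacement(100, 10, seed=42)
print("SRSWOR:", sorted(wor))
print("SRSWR: ", sorted(wr))
```

The seed is fixed only so that the draw can be repeated; omitting it gives a fresh random draw each time.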


II. Table of random numbers

• The easiest way to select a sample randomly

• Assume we need to select a sample of 10 units from a finite population of 100 units

• The steps to be followed are:

S1: A list of all 100 units in the finite population is prepared and each unit is assigned a serial number from 1 to 100

S2: Any number in the table is chosen at random; it is the starting point for selecting the sampling units

S3: From the starting point we move on to the next number either vertically, horizontally or diagonally

S4: Use the units with the selected numbers as your sample

• This process is continued until we reach 10 such random numbers
Advantage/Merit of SRS:
• Lack of bias and prejudices

• More representative of the universe

• Simplicity

• Less knowledge required


Disadvantage/Demerit of SRS:
• Time is needed to gather the full list of the population

• Bias can occur when the sample is not large enough

• Not applicable when the units are heterogeneous in nature

• If the population is heterogeneous, go in for stratified random sampling
2. Systematic Sampling

• Arrange the target population according to some ordering scheme

• Select elements at regular intervals

• Begin from a random start

• And then proceed with the selection of every kth element from then onwards, where k is the sampling interval
How to Obtain the Sampling Interval?
• If the population contains N ordered elements

• And a sample of size n is required

➢ then we find the ratio of these two numbers

• Denote the sampling interval by k (to keep it distinct from the sample size n):

k = population size / sample size

= N/n
Ex:
• Assume N = 9
• Want n = 3
• Sampling interval = N/n = 9/3 = 3
• Select a random number from 1 to 3 as the starting point
• Assume it is 3; the 3rd, 6th and 9th elements are then selected
Class Activity:

• Say the population size is N = 700 and the desired sample size is 70 (n = 70)

a. What is the sampling interval?

b. Which elements will be selected?

Answer:

a. The sampling interval will be: 700/70 = 10

b. After a random start within the first 10 elements, every 10th element will be selected

❖ If the researcher starts from the 9th element, then the 19th, 29th, 39th, 49th, etc., elements will be selected.
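
The class activity can be sketched in Python as follows. The function name `systematic_sample` is hypothetical, and the sketch assumes the population size is an exact multiple of the sample size, as in both examples above.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Positions (1-based) chosen by systematic sampling.

    The sampling interval is N // n; if no start is given, a random start
    between 1 and the interval is drawn.
    """
    interval = population_size // sample_size
    if start is None:
        start = random.Random(seed).randint(1, interval)
    return list(range(start, population_size + 1, interval))[:sample_size]


positions = systematic_sample(700, 70, start=9)
print(positions[:5])   # [9, 19, 29, 39, 49]
print(len(positions))  # 70
print(systematic_sample(9, 3, start=3))  # [3, 6, 9]
```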


• Systematic sampling assumes that the population elements are ordered in some fashion (like names in the telephone directory).

• Systematic sampling may increase representativeness when items are ordered with regard to the characteristic of interest

Ex:

• If a population of customers is ordered by decreasing purchase volume, a systematic sample will be sure to contain some high-volume and some low-volume customers.

Merits:
• This method is simple and convenient
• Less time consuming (no need to number each member of a sample)
Limitation:
• Since it is quasi-random sampling, the sample may not be representative.
• Applicable only if a complete list of the population is available
3. Stratified Sampling

• A mixture of deliberate and random sampling techniques.

• Used if the population does not constitute a homogeneous group.

• The population is divided into various classes or sub-populations.

• These are individually more homogeneous than the total population

• The heterogeneous population of N units is sub-divided into L homogeneous, non-overlapping sub-populations called strata

• The ith stratum has $N_i$ units (i = 1, 2, 3, …, L) such that $N_1 + N_2 + \dots + N_L = N$

• A sample of size $n_i$ from the ith stratum (i = 1, 2, …, L) is taken independently by simple random sampling, in such a way that $n_1 + n_2 + \dots + n_L = n$.

• A sample obtained using this procedure is called a stratified random sample.
• What are the common factors on which the population is grouped?
• Age • Gender • Income • Race • Religion • Etc
[Figure: a heterogeneous population divided into groups (Males, Females, Children) that are individually more homogeneous]
Example
❑ Suppose a researcher wishes to collect information regarding the income and expenditure of the male population of, say, Jimma Town.

❑ First, we split the whole male population of the town into various strata

❑ This can be done on the basis of profession, for example:

✓ Service providers
✓ Businessmen
✓ Shopkeepers/sellers
✓ Others

❑ From these different groups, the researcher will select elements using a random sampling technique.
• Questions to be considered in stratified sampling

I. How to form strata?

• Strata can be formed on the basis of common characteristics.

• Strata are:
➢ purposively formed and
➢ usually based on the past experience and
➢ personal judgment of the researcher
Tips:

• In the stratified sampling method:

1. The number of stratification variables should not be more than two (1 or 2 is good); with more, it is too complex to manage

2. The number of strata should be no more than six

II. How should items (elements) be selected from each stratum?

• Simple random sampling.

• Systematic sampling can also be used if it is considered more appropriate in certain situations.

III. How many items are to be selected from each stratum (sample size)?
• Proportionate to the relative size of that stratum.
• Ex: If $p_i$ is the proportion of the population included in stratum i and n is the total sample size, the sample size for stratum i will be $p_i \times n$
• If we take a sample of equal size from each stratum regardless of its proportion, it is disproportionate sampling
Example:
• Suppose a company has the following staff:
1. Full-time male workers ….... 90
2. Part-time male workers ….... 18
3. Full-time female workers ….. 27
4. Part-time female workers ….. 45
Total ………………………… 180 staff
• We are asked to take a sample of 40 staff, stratified according to the above categories
• There are two approaches to calculating the stratum sample sizes:

I. Proportionate stratified sampling

II. Disproportionate stratified sampling

I. Proportionate stratum sample size calculation
• The 1st step is to calculate each group's percentage of the total
1. % Full-time male workers ….. 90/180 = 50%
2. % Part-time male workers ….. 18/180 = 10%
3. % Full-time female workers ….. 27/180 = 15%
4. % Part-time female workers ….. 45/180 = 25%
• This tells us how to divide our sample of 40:
1. Sample from full-time male workers = 50% × 40 = 20
2. Sample from part-time male workers = 10% × 40 = 4
3. Sample from full-time female workers = 15% × 40 = 6
4. Sample from part-time female workers = 25% × 40 = 10
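
The proportionate allocation above can be sketched in Python. The function name `proportionate_allocation` and the dictionary keys are illustrative assumptions; note that with other figures the rounded stratum sizes may not add up exactly to the total sample and would need a manual adjustment.

```python
def proportionate_allocation(strata_sizes, total_sample):
    """Allocate the total sample to strata in proportion to their sizes."""
    population = sum(strata_sizes.values())
    return {
        name: round(total_sample * size / population)
        for name, size in strata_sizes.items()
    }


staff = {
    "full-time male": 90,
    "part-time male": 18,
    "full-time female": 27,
    "part-time female": 45,
}
allocation = proportionate_allocation(staff, 40)
print(allocation)
# {'full-time male': 20, 'part-time male': 4, 'full-time female': 6, 'part-time female': 10}
```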
II. Disproportionate stratum sample size calculation

• Since a sample of 40 is to be taken, the researcher takes 10 from each of the 4 strata, regardless of their proportions
Merits:

• It provides a chance to study all the sub-populations separately.

• An optimum size of the sample can be determined for a given cost, precision and reliability.

• It gives a more precise sample

• Subgroups in the population are represented

• Bias is reduced and precision is greater

Limitations:

• There is a possibility of faulty stratification, and hence accuracy may be lost

• Proportionate stratification requires accurate information on the proportion of the population in each stratum.
4. Cluster sampling
• A method of probability sampling that is often used to study:

➢ large populations,

➢ particularly those that are widely dispersed geographically
• Researchers divide a population into multiple smaller groups known as clusters

• The population is divided into clusters that are similar to one another (each cluster is itself a mixed, heterogeneous group)
• This is usually done on the basis of geographical contiguity/closeness.

• The primary sampling units are groups rather than individuals

• A sample of such clusters is then selected
• Clusters can be groups like:

➢ schools,

➢ manufacturing units,

➢ households,

➢ cities or

➢ blocks of a city,

➢ etc.
• Then take a sample from the clusters
• After randomly selecting the primary sampling units (city, part of a city), we survey or interview all families or elements in the selected primary sampling units.

➢ All units from the selected clusters are studied
• The area sample is the most commonly used type of cluster sampling.
How to cluster a sample?

• It is often done as two-stage sampling

• 1st stage: a sample of areas (clusters) is chosen

• 2nd stage: a sample of respondents within those areas is selected
One-stage sampling:
• All of the elements within the selected clusters are included in the sample
• Has 4 steps
Step 1: Define your population
Step 2: Divide your population into clusters
Step 3: Randomly select clusters to use as your sample
Step 4: Collect data from the sample
Two-stage sampling:
• A subset of elements within the selected clusters is randomly selected for inclusion in the sample
Ex 1: Assume a researcher wants to evaluate consumer spending on various modes of transportation in Bole sub-city of Addis Ababa. Since Bole is a large sub-city with 14 woredas, the researcher needs to apply a clustering approach to take a sample from it.
• The key stages in applying cluster sampling for this research would be:
Stage 1: Defining the total population:
✓ The target population is people living in Bole, Addis Ababa
Stage 2: Dividing the population into 14 clusters:
✓ The area can be divided into its 14 woredas, or clusters
Stage 3: Choosing a sample of clusters out of the total:
✓ Ex: the researcher may choose 3 of the 14 woredas
✓ HHs residing in these 3 woredas will make up the sample for the study
Stage 4: Choosing individual households to be included in the study
Ex 2:
• Suppose you are interested in the average reading and writing level of all the 5th graders in Addis Ababa
• It is difficult to obtain a list of all 5th graders and collect data from a random sample spread across Addis.
• So:
➢ Define the population: the 5th graders in Addis Ababa
➢ Cluster the population: cluster the 5th graders by the school they attend
➢ Randomly select clusters (schools) to use as your sample
➢ Collect data from the sample
Generally:

Choosing students from schools, using one-stage and two-stage cluster sampling:

➢ Select all schools, then sample within schools (this is stratified sampling by school)

➢ Sample schools, then measure all students (one-stage cluster sampling)

➢ Sample schools, then sample students (two-stage cluster sampling)

Ex 3: Suppose a tourist guide in Lalibela, Ethiopia, decides to research how satisfied visitors are with his overall management, the places visited and his guiding capacity.

• He can manage just 7 people per tour

• He wants to survey his customers on a day on which he gives 10 tours.

• Out of the 10 tours he gives on that day, he randomly selects 4 tours and asks every customer on them about their feelings and experiences
Ex 4:
• Suppose we want to estimate the proportion of defective machine parts in an inventory.
• Assume that there are about 20,000 machine parts in the inventory, stored in 400 cases of 50 parts each.
• Using cluster sampling, we would treat the 400 cases as clusters. From these clusters we randomly select, say, n cases and examine all the machine parts in each selected case.
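A minimal sketch of Ex 4 in Python follows. The inventory is simulated: the 3% defect rate, the choice of n = 20 cases and the function name are assumptions made only to have data to sample from.

```python
import random

rng = random.Random(42)

# Hypothetical inventory: 400 cases x 50 parts; True marks a defective part.
# The 3% defect rate is an assumption used only to generate example data.
inventory = [[rng.random() < 0.03 for _ in range(50)] for _ in range(400)]

def cluster_estimate(cases, n_cases, rng):
    """Pick n_cases whole cases at random and inspect every part in them."""
    chosen = rng.sample(range(len(cases)), n_cases)
    inspected = [part for i in chosen for part in cases[i]]
    return sum(inspected) / len(inspected), len(inspected)

estimate, n_inspected = cluster_estimate(inventory, 20, rng)
true_share = sum(map(sum, inventory)) / 20000
print(f"inspected {n_inspected} parts, estimate {estimate:.3f}, true {true_share:.3f}")
```

Only 20 cases have to be opened to inspect 1,000 parts, which is the cost saving that cluster sampling is meant to deliver.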
Merit:
• Cluster sampling reduces costs by concentrating the survey in the selected clusters.
Limitation:
• It is less precise than simple random sampling.
• Cluster sampling is used only because of the economic advantage it offers.
Cluster Sampling vs Stratified Sampling
BASIS FOR COMPARISON | STRATIFIED SAMPLING | CLUSTER SAMPLING
Meaning | The population is divided into homogeneous segments, and the sample is then taken randomly from the segments. | The members of the population are selected at random from naturally occurring groups called 'clusters'.
Sample | Randomly selected individuals are taken from all the strata. | All the individuals are taken from randomly selected clusters.
Selection of population elements | Individually | Collectively
Homogeneity | Within group | Between groups
Heterogeneity | Between groups | Within group
Bifurcation/Group | Imposed by the researcher | Naturally occurring groups
Objective | To increase precision and representation. | To reduce cost and improve efficiency.
What Are Their Similarities?
STRATIFIED SAMPLING | CLUSTER SAMPLING
Probability sampling method | Probability sampling method
Divides a population into distinct groups (strata) | Divides a population into distinct groups (clusters)
Tends to be a quicker and more cost-effective way of obtaining a sample from a population than a simple random sample. | Tends to be a quicker and more cost-effective way of obtaining a sample from a population than a simple random sample.
What Are Their Differences?
STRATIFIED SAMPLING | CLUSTER SAMPLING
A probability sampling procedure in which the population is separated into different homogeneous segments called 'strata', and the sample is then chosen randomly from each stratum. | A sampling technique in which the units of the population are randomly selected from already existing groups called 'clusters'.
Individuals are randomly selected from all the strata to constitute the sample. | The sample is formed when all the individuals are taken from randomly selected clusters.
Population elements are selected individually from each stratum. | Population elements are selected in aggregates.
There is homogeneity within the group. | Homogeneity is found between groups.
Heterogeneity occurs between groups. | The members of a group are heterogeneous.
Categories are imposed by the researcher. | The categories already exist as naturally occurring groups.
Stratified Versus Cluster Sampling
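The contrast can be made concrete with a small Python sketch. The six groups of eight units and both function names are hypothetical. Stratified sampling takes a few units from every group, while cluster sampling takes every unit from a few groups.

```python
import random

def stratified_sample(groups, per_group, rng):
    """Draw per_group members at random from every group (stratum)."""
    return {name: rng.sample(members, per_group) for name, members in groups.items()}

def cluster_sample(groups, n_groups, rng):
    """Draw n_groups whole groups (clusters) at random and keep all their members."""
    chosen = rng.sample(sorted(groups), n_groups)
    return {name: list(groups[name]) for name in chosen}

# Hypothetical population: 6 groups of 8 units each.
groups = {f"group_{g}": [f"{g}{i}" for i in range(8)] for g in "ABCDEF"}

rng = random.Random(1)
strat = stratified_sample(groups, 2, rng)
clust = cluster_sample(groups, 2, rng)

print(len(strat), sum(len(v) for v in strat.values()))  # 6 12
print(len(clust), sum(len(v) for v in clust.values()))  # 2 16
```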
5. Multi-stage sampling
• A sampling method that divides the population into groups (or clusters) for conducting research.
• It is a further improvement over cluster sampling.
• Also known as multi-stage cluster sampling.
• Items are selected at random from the population using smaller and smaller groups (units) at each stage.
• It is often used in national surveys to collect data from a large, geographically spread group of people.
Ex 1: Suppose we wish to estimate the yield per hectare of a given crop, say coffee, in Gedeo zone.
• We begin with a random selection of, say, 5 districts.
• From each of these 5 districts, 10 villages will be chosen in the same manner.
• In the final stage we will again randomly select 5 farms from every village.
• Thus, we shall examine the per-hectare yield on a total of 5 × 10 × 5 = 250 farms all over that region.
[Figure: 5 districts, 10 villages per district, 5 farms per village, 250 farms in total]
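The three stages of this example can be sketched as follows. The sampling frame is hypothetical (8 districts, 25 villages per district and 40 farms per village are invented numbers); only the 5, 10 and 5 selected at each stage come from the example.

```python
import random

rng = random.Random(7)

# Hypothetical frame: 8 districts, 25 villages per district, 40 farms per village.
frame = {
    f"district_{d}": {
        f"village_{d}_{v}": [f"farm_{d}_{v}_{f}" for f in range(40)]
        for v in range(25)
    }
    for d in range(8)
}

def multistage_sample(frame, n_districts, n_villages, n_farms, rng):
    """Three-stage selection: districts, then villages, then farms."""
    farms = []
    for district in rng.sample(sorted(frame), n_districts):
        villages = frame[district]
        for village in rng.sample(sorted(villages), n_villages):
            farms.extend(rng.sample(villages[village], n_farms))
    return farms

sample = multistage_sample(frame, 5, 10, 5, rng)
print(len(sample))  # 250
```

A full list of farms is needed only for the 50 selected villages, not for the whole zone.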
Advantages of Multi-stage Sampling Technique
• It is easier to administer than most sampling techniques.
• A large number of units can be sampled for a given cost because of sequential clustering, whereas this is not possible in most sample designs.
• It is a relatively convenient, less time-consuming and less expensive method of sampling.
Disadvantages of Multi-stage Sampling Technique
• An element of sampling bias is introduced because of the unequal sizes of some of the selected sub-samples.
• This method is recommended only when it would be impractical to draw a sample with a simple random sampling technique.
Choosing Non-probability vs Probability Sampling
FACTORS | CONDITIONS FAVORING NON-PROBABILITY SAMPLING | CONDITIONS FAVORING PROBABILITY SAMPLING
Nature of research | Exploratory | Conclusive
Relative magnitude of sampling and non-sampling errors | Non-sampling errors are larger | Sampling errors are larger
Variability in the population | Homogeneous (low variability) | Heterogeneous (high variability)
Statistical considerations | Unfavorable | Favorable
Operational considerations | Favorable | Unfavorable
Sampling Error and Non-Sampling Error
1). Sampling Error
❖ It is the difference between the result of a sample and the result of a census.
❖ Equivalently, it is the difference between the sample estimate and the actual value of the population.
❖ The sample size is not equal to the population size except in the case of complete enumeration.
❖ Even when the sample is properly selected, there will be some difference between the sample statistic and the actual value (population parameter).
❖ The mean of the sample might differ from the population mean by chance alone.
❖ The standard deviation of the sample might also differ from the population standard deviation.
❑ Sampling error arises:
➢ Because of chance only.
➢ Even when you are working with representative samples.
➢ Because it is the inevitable gap between your sample and the true population value.


Ex:
❑ To illustrate this, let us take a very simple example:
• Suppose an individual student has scored the following grades in 10 subjects.
• Consider these subjects as the population:
55, 60, 65, 90, 55, 75, 88, 45, 85, 82
• Let a sample of four grades, 55, 65, 82 and 90, be selected at random from this population.
• The sample is taken to estimate the average grade of this student.
• The mean of this sample is 73.
• But the population mean is 70.
• The sampling error is therefore 73 - 70 = 3.
• However, the variation due to random fluctuation (sampling error) decreases as the sample size increases, though it is not possible to avoid sampling error completely.
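The arithmetic of this example, and the claim that sampling error shrinks as the sample grows, can be checked with a short Python sketch. The repeated-sampling part is an illustration added here; the number of trials and the seed are arbitrary choices.

```python
import random

grades = [55, 60, 65, 90, 55, 75, 88, 45, 85, 82]  # population from the example
sample = [55, 65, 82, 90]                           # sample from the example

population_mean = sum(grades) / len(grades)
sample_mean = sum(sample) / len(sample)
sampling_error = sample_mean - population_mean
print(population_mean, sample_mean, sampling_error)  # 70.0 73.0 3.0

def average_abs_error(n, trials=2000, seed=0):
    """Average |sample mean - population mean| over repeated random samples of size n."""
    rng = random.Random(seed)
    errors = [
        abs(sum(rng.sample(grades, n)) / n - population_mean)
        for _ in range(trials)
    ]
    return sum(errors) / trials

# Larger samples give smaller errors on average; n = 10 is a census.
for n in (2, 4, 8, 10):
    print(n, round(average_abs_error(n), 2))
```

With n = 10 every subject is included, so the error is exactly zero, which matches the statement that sampling error disappears only under complete enumeration.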
2). Systematic Error / Non-sampling Error
➢ Systematic error is also called sampling bias.
➢ It cannot be reduced or eliminated by increasing the sample size.
➢ Such error occurs because of human mistakes and not chance variation.

➢ The possible factors contributing to such error include:
I. Inappropriate sampling frame/sampling procedure/sampling method
II. Accessibility bias
III. Non-response bias or defects in data collection
IV. Measurement error (defective measuring device)


1. Inappropriate sampling:
• If the sample units misrepresent the population, the result is sample bias.
➢ Such errors arise from:
✓ the sampling procedure/sampling method
✓ the way the survey is designed
• This could happen when a researcher gathers data from a sample that was drawn from some favored locations.
• It also occurs when not all units in the population have some probability of being selected for the sample (an inadequate survey target population).
2. Accessibility bias:
• In many research studies, researchers tend to select respondents who are the most accessible to them.
• When all members of the population are not equally accessible, the researcher must provide some control mechanism to ensure that no respondents are over- or under-represented.
3. Non-response bias:
• This is incomplete coverage of the sample, or the inability to get a complete response from all individuals initially included in the sample.
• In some cases, respondents may intentionally give false information in response to sensitive questions.
• For instance, people may not tell the truth about their bad habits and income.
4. Measurement error
• Errors in measurement include:
➢ Inaccurate answers to questions on age, income, and events that happened in the past
➢ The interviewer failing to record the responses correctly
➢ Errors in coding, editing, and tabulation


❑ Maximizing accuracy requires that total study error be minimized.
❑ Total error = Sampling error + Non-sampling error
❑ Total error is usually measured as total error variance, also known as mean squared error (MSE):
$(TE)^2 = (SE)^2 + (NE)^2$
where TE is the total error, SE the sampling error and NE the non-sampling error.


Generally,
❖ Non-sampling errors occur in a sample survey as well as in a census survey, whereas
❖ sampling error occurs only in a sample survey.
❖ Preparing the survey questionnaire and handling the data properly can minimize non-sampling error.
• If $\bar{x}$ is the sample mean and $\mu$ is the population mean of the characteristic X, then the sampling error is $\bar{x} - \mu$.
• The sampling error may be positive, negative or zero.
❖ Non-sampling errors are more serious than sampling errors.
❖ Why?
a. A sampling error can be minimized by taking a large sample.
b. It is difficult to minimize non-sampling errors, even if a large sample is taken.
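A small simulation can illustrate points (a) and (b). Everything in it is hypothetical: the income population is invented, and the non-sampling error is modeled, as an assumption, by every respondent reporting only 90% of true income.

```python
import random

rng = random.Random(3)

# Hypothetical population of 100,000 incomes (values invented for this sketch).
population = [rng.gauss(5000, 1500) for _ in range(100_000)]
true_mean = sum(population) / len(population)

UNDER_REPORTING = 0.9  # assumption: each respondent reports 90% of true income

def survey_errors(n, rng):
    """Return (error of an honest sample mean, error of an under-reported sample mean)."""
    sample = rng.sample(population, n)
    honest = sum(sample) / n
    reported = sum(x * UNDER_REPORTING for x in sample) / n
    return honest - true_mean, reported - true_mean

for n in (100, 1_000, 10_000):
    sampling_err, total_err = survey_errors(n, rng)
    print(f"n={n:>6}  sampling error={sampling_err:8.1f}  with under-reporting={total_err:8.1f}")
```

As n grows, the first error column should move toward zero, while the second stays near -500 (10% of the mean) because a larger sample does not correct a systematic reporting error.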
Sampling Theory
• Sampling theory is the study of the relationship between a population and the samples drawn from it.
• Sampling theory is applicable only to random samples.
• The theory of sampling is concerned with estimating the properties of the population from those of the samples, and also with gauging the precision of the estimate.
❖ This sort of movement from the particular (sample) towards the general (population) is what is known as statistical induction or statistical inference.
❖ In simple words, from the sample we attempt to draw inferences concerning the population.
• In order to be able to follow this inductive method, we first follow a deductive argument: we imagine a population and investigate the behavior of the samples drawn from it by applying the laws of probability.
• The methodology dealing with all this is known as sampling theory.
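The deductive step can be illustrated with the ten grades used earlier in this chapter. Treating them as a fully known population, the sketch below lists every possible sample of four grades and looks at how the sample mean behaves. The choice of sample size 4 follows that example; the rest is an illustration added here.

```python
from itertools import combinations
from statistics import mean

# Population reused from the grades example earlier in the chapter.
grades = [55, 60, 65, 90, 55, 75, 88, 45, 85, 82]
population_mean = sum(grades) / len(grades)

# Deductive step: list every possible sample of size 4 and compute its mean.
sample_means = [sum(s) / 4 for s in combinations(grades, 4)]

print(len(sample_means))                     # 210 possible samples
print(population_mean, mean(sample_means))   # 70.0 70.0
print(min(sample_means), max(sample_means))  # 53.75 86.25
```

Individual samples can miss the population mean by a wide margin, yet the average of all possible sample means equals the population mean. This is the kind of regularity that lets us reason back from one observed sample to the population.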
• Sampling theory is designed to attain one or more of the following objectives:
✓ Statistical estimation: Sampling theory helps in estimating unknown population parameters from knowledge of statistical measures obtained from sample studies.
✓ Testing of hypotheses: It enables us to decide whether to accept or reject the stated hypothesis, that is, whether observed differences are due to chance or are really significant.
✓ Statistical inference: Sampling theory helps in making generalizations about the population from studies based on samples drawn from it. It also helps in determining the accuracy of such generalizations.
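As a rough illustration of statistical estimation, the four sampled grades from the earlier example give both a point estimate of the population mean and a standard error that gauges its precision. This is a sketch only: it uses the textbook formula s / sqrt(n) and ignores the finite population correction, which would matter for a sample of 4 out of 10.

```python
from math import sqrt
from statistics import mean, stdev

sample = [55, 65, 82, 90]  # the four grades drawn in the chapter's example

n = len(sample)
estimate = mean(sample)                   # point estimate of the population mean
standard_error = stdev(sample) / sqrt(n)  # precision of that estimate

print(estimate, round(standard_error, 2))
```

A smaller standard error means a more precise estimate. Here the true population mean of 70 lies well within one standard error of the estimate of 73.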
CHAPTER SUMMARY QUESTIONS
1. Systematic sampling is more random than a simple random
sampling method. A) True B) False.
2. Cluster sampling is one of the probability sampling
methods where individuals are randomly selected as the
primary sampling units finally. A) True B) False.
3. The first and last stages in conducting a census for a research study are defining the target population and executing the plan, respectively. A) True B) False.
4. A sampling technique is preferable to a census when the population is heterogeneous. A) True B) False.
5. The heterogeneous population of size N units is sub-
divided into H homogeneous non-overlapping sub
populations called Strata. A). True B). False
1. Why is probability sampling the most preferred?
A. It is very convenient in the situation when the sample to be
selected is very small and the researcher wants to get some
idea of the population characteristics
B. The sample units are selected based on the decision of the
researcher
C. Each unit of the population has some unknown probability
of entering the sample
D. The process of sampling is automatic, in which the selection of one unit depends exclusively on the selection of the other
E. All
F. None
2. Why is sampling so important to research?
A. Maximizing efficiency and being representative
B. It is usually impossible to study the entire population
C. Need for specialized knowledge
D. All E. Except C F. None
3. Basiliyos is worried that the difference between the results of his study and what is actually true for the population may be too large. Which of the following best defines what Basiliyos is worried about?
A. The randomness of his study
B. The sampling frame in his study
C. The population in his study
D. The error of his study E. All F. None
4. Which of the following is not true about non-probability sampling?
A. Samples are selected based on subjective judgment
B. Random sample selection is not intended, so units do not get an equal chance of selection
C. Sampling units are selected at the discretion of the
researcher
D. Used when the representativeness of the population is not
the prime issue
E. One person could have a 15% chance of being selected and
another person could have a 50% chance of being selected.
F. None
5. The list of elements from which the sample is drawn is:
A). Element B). Sampling Unit C). Sampling frame
D). A and B
6. The basic unit about which the researcher collects information and which provides the basis of analysis is ______________
A). Element B). Sampling Unit C). Sampling frame D). All
E). A and B F). None

7. Choose the one which is different

A). Simple B). Systematic C). Cluster D). Purposive E). All
8. Natanim wants to study the teaching-learning
methodology of graduating students under FBE
departments of Unity University in 2024. She thinks
using a stratified random sample would be the best
option for her study. Which of the following methods
would ensure Natanim takes a stratified random
sample
A. Draw a sample from each department of the faculty
B. Use lottery method
C. Employ table of random numbers
D. Order 4th year students by their average grade level and choose every 50th participant
E. Choose economics department students from each section
of the year
F. None
9. The error due to the difference between the sample estimate and the actual value of the population is ________
A). Variance B). Systematic C). Sampling
D). Non-sampling
10. The most preferred type of sampling is ______
A). Convenience B). Judgmental C). Quota
D) Snowball E). All F). None
11. Secondary data should be verified in terms of ______
A). Reliability B). Suitability C). Adequacy
D). Sufficiency E). All F) None
12. A hit-or-miss procedure of sampling is:
A). Convenience B). Judgmental C). Quota
D). Snowball E). All F). None
13. A method of probability sampling that is often used to study large populations, particularly those that are widely geographically dispersed, is
A). Strata B). Systematic C). Cluster D) Simple
E). All F). None


14. The type of representative sampling most preferred
for the heterogeneous and homogenous population is
A. Simple Random Sampling and Systematic Sampling
respectively
B. Systematic Sampling and SRS respectively
C. Stratified Random Sampling and cluster Sampling
respectively
D. Cluster Sampling and Stratified Sampling respectively
E. Stratified Random Sampling and SRS respectively
F. None
15. A population is divided into clusters and it has been found that all the units within a cluster are the same. In this situation, which sampling method will be adopted?

A. SRSWOR

B. Stratified random sampling

C. Cluster sampling

D. Systematic sampling

E. Multi-stage sampling

F. None
16. Hiwot is trying to put a sample together for her
current study. Her advisor suggests that Hiwot use
probability sampling to create a sample. She agrees
and decides to use a simple random sample. Which
of the following methods could Hiwot use to ensure
that she is creating a simple random sample?
A. Choosing every 150th name in a sampling frame
B. Choosing people from different segments of a society
C. Looking at the subject’s profile and choosing the
best participant
D. Lottery Method
E. All
F. None
17. Which of the following explains the difference between probability and non-probability sampling?
A. In a probability sample, the participants are chosen through
an interview selection process while in a non-probability
sample the participants are chosen using random number
generator.
B. In a probability sample, the participants are chosen using a
convenient sample interview selection process while in a
non-probability sample the participants are chosen using
lottery method.
C. In a non-probability sample, the participants are chosen via a non-random selection process while in a probability sample the participants are chosen randomly.
D. In a probability sample, the participants are chosen by
using judgment while in a non-probability sample the
participants are chosen using a computer program. E. All
18. As Soreti conducts her research, she becomes
concerned that her sample might not really represent the
population under study in the research. Why is this a
major concern for Soreti?
A. The research is biased if the sample is not truly
representative of the study population
B. The research cannot be statistically significant if the
sample is not truly representative
C. The research cannot be replicated if the sample is not truly
representative.
D. The research cannot be generalized to the wider population
if the sample is not truly representative.
E. All
F. None
19. Which of the following is NOT a step in the sampling process?
A. Identify the population and sampling frame
B. Estimate the cost of planning
C. Determine the sample size
D. Determine the sampling procedure
E. Identify the sampling tool
F. None
20. I am one of the 5 million people who bought a lottery
ticket that rewards ETB 30 million, with the outcomes
supposed to be revealed on the eve of the upcoming
Christmas. What is my probability of winning the lottery?
A. 50% B. 0.00002% C. 0.6% D. 0.0000002% E. Can’t be
determined F. None
END OF CHAPTER EIGHT

THANK YOU!
