
Selecting Samples: Lecture - 7

The document outlines different sampling techniques, including probability and non-probability sampling. It discusses key aspects of probability sampling such as identifying a suitable sampling frame, deciding on an appropriate sample size based on factors like confidence level and margin of error, and selecting sampling techniques like simple random sampling, systematic sampling, stratified sampling, and cluster sampling. Non-probability sampling techniques mentioned include quota sampling, purposive sampling, snowball sampling, self-selection sampling, and convenience sampling.

Uploaded by

Ahsan
Copyright
© Attribution Non-Commercial (BY-NC)

Lecture - 7

Selecting
Samples
» PROBABILITY AND NON-PROBABILITY SAMPLING

» STAGES IN PROBABILITY SAMPLING

» CORRELATION AND ITS ESTIMATION

Lecture’s Outline
Two Major Sampling techniques
Probability (Representative) and Non-probability (Judgemental)

Figure 6.2 Sampling techniques


» Probability samples: samples in which members of the
population have a known and usually equal chance
(probability) of being selected
˃ i.e. it is possible to answer research questions that
require you to make statistical inferences about the
characteristics of the population

» Non-probability samples: samples in which the
chances (probability) of selecting members from
the population are unknown
˃ i.e. it is impossible to answer research questions that
require you to estimate statistically the characteristics of
the population from the sample.

Probability and non-probability sampling


» Probability sampling requires a sampling frame. When a
sampling frame cannot be obtained, non-probability
sampling is used.

» Sampling frame: a complete list of all cases in the
population, from which the sample will be drawn.

» Where no suitable list exists, the researcher will have to
compile his/her own sampling frame.

» It is important to ensure that the sampling frame is unbiased,
current and accurate.

Probability and non-probability sampling


1) Identify a suitable sampling frame based on
your research question(s) or objectives.
2) Decide on a suitable sample size.
3) Select the most suitable sampling technique
and select the sample.
4) Check that the sample is representative of
the population.

Stages in Probability Sampling


A Sampling Frame (SF) is a complete list of all the cases in
the population from which the sample will be
drawn.
Checklist for selecting the SF:
» Does the SF include all cases?
» Are the cases listed in the SF current?
» Does the SF exclude irrelevant cases, i.e. is it
precise?

Identify a suitable sampling frame


The choice of sample size is governed by:
» The confidence level, i.e. the level of certainty that
the characteristics of the data collected will
represent the characteristics of the population.
» The margin of error that you can tolerate, i.e. the
accuracy/precision you require.
» The types of analyses you are going to undertake,
particularly the number of categories into which
the data will be subdivided.
» The size of the population from which the sample is drawn, and its distribution.

Decide on a suitable sample size


1. Rule of thumb: a minimum of 30 cases in each category within
the overall sample. Where the population in a category is less
than 30, take the whole population of that category into the sample.
2. Where the population is larger than 10000: to ensure a 95% level of
certainty (that population characteristics are represented in the
sample), use the following formula:
$$n = p\% \times q\% \times \left(\frac{z}{e\%}\right)^2$$
(See ‘worked example’ on next slide)
where
n = minimum sample size
p% = proportion belonging to the specified category
q% = proportion not belonging to the specified category
z = z value for the chosen level of confidence (z = 1.96 for 95% level of certainty)
e% = margin of error that you can tolerate
Continues on next slide

Decide on a suitable sample size


Question: If you want to study the performance of a
salesperson in a community of 4000 people, what should
your sample size be?
Solution: According to the formula on the previous slide, you need
to know the proportion of the population who received a visit from
the salesperson:
p% (proportion belonging to the specified category) and
q% (proportion not belonging to the specified category).
If you don’t know p% and q%, you will have to carry out a pilot
survey, perhaps of 30 responses, to estimate the proportion of
respondents who receive a visit from the salesperson each week. If
such a pilot survey reveals that 12 out of 30 clients receive a
visit from the salesperson once a week, then 40 percent
belong to this category and 60 percent do not; so

Next Slide ........

Decide on a suitable sample size


Now you know that:
p% = 40;
q% = 60
To ensure a 95% level of certainty,
z = 1.96 and e% = 5.
Applying the formula for sample size ‘n’:
n = 40 x 60 x (1.96/5)^2
= 2400 x (0.392)^2
≈ 2400 x 0.154
≈ 369.6 (Say sample size = 370)

Levels of confidence and associated z values:
Level of confidence    z value
90%                    1.65
95%                    1.96
99%                    2.57

Decide on a suitable sample size
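
The worked example above can be checked with a few lines of Python. This is only a sketch: the function name `minimum_sample_size` is made up for illustration, and q% is assumed to be 100 minus p%. Because the code does not round (0.392)^2 to 0.154 as the slide does, it returns about 368.8 rather than 369.6, which rounds up to 369 rather than 370.

```python
import math


def minimum_sample_size(p_pct: float, z: float, e_pct: float) -> float:
    """Return n = p% * q% * (z / e%)^2, assuming q% = 100 - p%."""
    q_pct = 100.0 - p_pct
    return p_pct * q_pct * (z / e_pct) ** 2


# Pilot survey from the slide: 12 of 30 respondents belong to the category.
p_pct = 12 / 30 * 100                      # 40 percent
n = minimum_sample_size(p_pct, z=1.96, e_pct=5)

print(round(n, 2))     # minimum sample size without intermediate rounding
print(math.ceil(n))    # rounded up to whole respondents
```

The difference between 369 and 370 comes purely from rounding an intermediate step, so either figure is a reasonable reading of the formula.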


3. Where the population is less than 10000, an adjusted minimum
sample size n’ is used:
$$n' = \frac{n}{1 + \frac{n}{N}}$$
where
n’ = adjusted minimum sample size
n = minimum sample size (calculated earlier)
N = Total population
Worked example: If population = 4000
n’ = 369.6 / {1 + (369.6/4000)}
= 369.6 / {1 + (0.092)}
= 369.6 / 1.092
= 338.46 (Say sample size = 339)
Decide on a suitable sample size
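
As a rough check of the adjustment for smaller populations, the sketch below applies n’ = n / (1 + n/N) to the slide's figures. The function name `adjusted_sample_size` is illustrative only. The code keeps full precision for n/N, so it gives about 338.34 where the slide, rounding n/N to 0.092, gets 338.46; both round up to 339.

```python
import math


def adjusted_sample_size(n: float, population: int) -> float:
    """Return the adjusted minimum sample size n' = n / (1 + n / N)."""
    return n / (1 + n / population)


# Figures from the slide: n = 369.6 and a population of N = 4000.
n_adjusted = adjusted_sample_size(369.6, 4000)

print(round(n_adjusted, 2))    # adjusted minimum sample size
print(math.ceil(n_adjusted))   # rounded up to whole respondents
```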
Continues from previous slide
4. Allowing for non-response:
Common reasons for non-response:
a. Refusal to respond
b. Inability to respond
c. Inability to locate respondent
So it is necessary to estimate the response rate, for example through a
pilot/preliminary survey. If the response rate is estimated at 30
percent, the ‘actual sample size’, abbreviated as na, will be:
na = (n / re%) * 100
where re% is the estimated response rate expressed as a percentage.
In our previous case, n = 369.60 (for a population of more than 10000)
or n’ = 338.46 (for a population of less than 10000); then:
na = (369.60/30) * 100 = 1232
or
na = (338.46/30) * 100 = 1128.2 (say 1129)

Decide on a suitable sample size
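
The non-response adjustment is a single division, sketched below with the slide's own numbers. The function name `actual_sample_size` and the decision to round up to a whole respondent are assumptions made for illustration.

```python
import math


def actual_sample_size(n: float, response_rate_pct: float) -> float:
    """Return na = (n / re%) * 100, where re% is the expected response rate."""
    return n / response_rate_pct * 100


# Slide's figures: a 30 percent response rate applied to n and to n'.
na_large_population = actual_sample_size(369.60, 30)
na_small_population = actual_sample_size(338.46, 30)

print(round(na_large_population, 1), math.ceil(round(na_large_population, 1)))
print(round(na_small_population, 1), math.ceil(round(na_small_population, 1)))
```

A lower expected response rate inflates the number of people you must approach very quickly, which is why the pilot estimate of the response rate matters.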


Five main techniques used for a probability
sample

» Simple random
» Systematic
» Stratified random
» Cluster
» Multi-stage

Selecting a sampling technique


• Simple random sampling: the probability of being
selected is “known and equal” for all members of the
population
• Blind Draw Method (e.g. names “placed in a hat” and then
drawn randomly)
• Random Numbers Method (all items in the sampling frame
given numbers, numbers then drawn using table or computer
program)
• Advantages:
• Known and equal chance of selection
• Easy method when there is an electronic database
• Disadvantages: (Overcome with electronic database)
• Complete accounting of population needed
• Cumbersome to provide unique designations to every
population member

Probability Sampling Methods


Simple Random Sampling
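
The random numbers method can be sketched with Python's standard `random` module. The sampling frame here is a made-up list of 4000 case labels, and the sample size of 370 is borrowed from the earlier worked example; neither comes from a real data set.

```python
import random

# Hypothetical sampling frame: every case gets a unique designation.
sampling_frame = [f"case-{i:04d}" for i in range(1, 4001)]

# A seeded generator makes the draw repeatable; drop the seed for a fresh draw.
rng = random.Random(7)

# random.sample draws without replacement, so every case has the same
# chance of selection and no case can be picked twice.
sample = rng.sample(sampling_frame, k=370)

print(len(sample))
print(sample[:5])
```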
• Systematic sampling: selecting a sample at regular
intervals from the sampling frame

Sampling Interval (SI) = N/ n


(n) = Actual sample size
(N) = Total population

• How to draw a systematic sample:


1) Calculate SI,
2) Randomly select a number between 1 and SI,
3) Go to this number as the starting point,
4) Select every SI-th case after the starting point.
Probability Sampling Methods: Systematic
Sampling
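
The four steps for drawing a systematic sample can be sketched as follows. The frame of 1000 numbered cases and the function name `systematic_sample` are hypothetical. The sketch rounds SI down to a whole number, so when N/n is not a whole number it can return slightly more than n cases.

```python
import random


def systematic_sample(frame, n, rng):
    """Pick a random start between 1 and SI, then take every SI-th case."""
    interval = len(frame) // n             # step 1: SI = N / n (rounded down)
    start = rng.randint(1, interval)       # step 2: random number in 1..SI
    positions = range(start, len(frame) + 1, interval)   # steps 3 and 4
    return [frame[pos - 1] for pos in positions]         # positions are 1-based


# Hypothetical frame of N = 1000 cases, numbered 1 to 1000; n = 100 gives SI = 10.
frame = list(range(1, 1001))
sample = systematic_sample(frame, n=100, rng=random.Random(11))

print(sample[:5])
print(len(sample))
```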
• Cluster sampling: method by which the population
is divided into groups (clusters), any of which can
be considered a representative sample.

• These clusters are mini-populations and therefore
are heterogeneous.
• Once clusters are established a random draw is
done to select one (or more) clusters to represent
the population.
• Area (next slide) and systematic sampling
(discussed earlier) are two common methods.

Probability Sampling Methods: Cluster Sampling
• Drawing the area sample:

• Divide the geographic area into sectors (subareas) and give
them names/numbers. Determine how many sectors are
to be sampled (typically a judgment call), then randomly
select these subareas. Do either a census or a systematic
draw within each selected subarea.

• To estimate the total for the whole geographic area, add the counts
in the sampled subareas together and multiply this number by
the ratio of the total number of subareas to the
number of subareas sampled.

Probability Sampling Methods: Cluster Sampling – Area Method
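
The area method can be sketched in a few lines: pick some subareas at random, count within them, and scale the count up by the ratio of all subareas to sampled subareas. The sector counts below are invented purely for illustration, and the choice of 4 sectors out of 12 stands in for the judgment call mentioned on the slide.

```python
import random


def estimate_area_total(sampled_counts, total_subareas):
    """Scale the summed counts of the sampled subareas up to the whole area."""
    return sum(sampled_counts) * (total_subareas / len(sampled_counts))


# Hypothetical census counts for 12 numbered sectors (illustrative data only).
sector_counts = {1: 40, 2: 55, 3: 38, 4: 61, 5: 47, 6: 52,
                 7: 44, 8: 58, 9: 50, 10: 43, 11: 49, 12: 63}

rng = random.Random(3)
chosen_sectors = rng.sample(sorted(sector_counts), k=4)   # random draw of clusters
counts = [sector_counts[s] for s in chosen_sectors]       # census within each one

estimate = estimate_area_total(counts, total_subareas=len(sector_counts))
print(chosen_sectors, round(estimate, 1), sum(sector_counts.values()))
```

Comparing the estimate with the true total (known here only because the data are made up) shows how much the result depends on which sectors happen to be drawn.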
» Stratified sampling: the population is separated
into homogeneous groups/segments/strata and a
sample is taken from each. The results are then
combined to get the picture of the total population.
» This method is used when the population
distribution of items is skewed. It allows us to
draw a more representative sample: if there
are more of a certain type of item in the population,
the sample has more of this type, and if there are
fewer of another type, there are fewer in the
sample.

Probability Sampling Methods


Stratified Sampling Method
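
A minimal sketch of stratified sampling follows, assuming proportional allocation (the same sampling fraction in every stratum), which is one way to get "more of this type" in the sample when there is more of it in the population. The frame of retail and wholesale customers and the name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, fraction, rng):
    """Draw a simple random sample of the same fraction from every stratum."""
    strata = defaultdict(list)
    for case in frame:
        strata[stratum_of(case)].append(case)      # separate into strata
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)         # proportional allocation
        sample.extend(rng.sample(members, k))      # random draw within stratum
    return sample


# Hypothetical skewed population: 900 retail and 100 wholesale customers.
frame = [(i, "retail") for i in range(900)] + [(i, "wholesale") for i in range(900, 1000)]
sample = stratified_sample(frame, stratum_of=lambda case: case[1],
                           fraction=0.10, rng=random.Random(5))

retail = sum(1 for _, stratum in sample if stratum == "retail")
wholesale = sum(1 for _, stratum in sample if stratum == "wholesale")
print(retail, wholesale)
```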
» Quota sampling (larger populations)
» Purposive sampling
» Snowball sampling
» Self-selection sampling
» Convenience sampling

Non- probability sampling techniques


• Judgment samples: samples that require a
judgment or an “educated guess” on the part
of the interviewer as to who should
represent the population. Also, “judges”
(informed individuals) may be asked to
suggest who should be in the sample.
• Subjectivity enters in here, and certain
members of the population will have a
smaller or no chance of selection
compared to others

Non-probability Sampling Methods


Judgment Sampling Method
• Referral samples (snowball samples): samples
which require respondents to provide the names
of additional respondents
• Members of the population who are less
known, disliked, or whose opinions conflict
with the respondent have a low probability of
being selected.
• Quota samples: samples that set a specific
number of certain types of individuals to be
interviewed
• Often used to ensure that convenience samples
will have desired proportion of different
respondent classes
Non-probability Sampling Methods
Referral and Quota Sampling Methods
• Convenience samples: samples drawn at
the convenience of the interviewer. Interviewers
tend to make the selection at familiar
locations and to choose respondents who
are like themselves.
• Error occurs because:
• 1) members of the population who are infrequent users or
nonusers of that location are under-represented, and
• 2) the respondents chosen may not be typical of the population

Non-probability Sampling Methods


Convenience Sampling Method
SPSS Exercise 7(a):
Correlation and its estimation
» If we are interested only in determining whether a
relationship exists, we employ correlation analysis.
Example: Student’s height and weight.
[Figure: four scatter plots, each titled “Plot of Height vs Weight”, with Weight (100 to 260) on the x-axis and Height on the y-axis]

Correlation Analysis… “-1 ≤ ρ ≤ 1”


Definition: Correlation between two variables X and Y is measured
through the Correlation Coefficient ‘r’, which measures the degree or
strength of linear association between the two variables and always
lies between -1 and +1, that is:
-1 ≤ r ≤ 1
» If the correlation coefficient is close to +1, you have a
strong positive relationship.
» If the correlation coefficient is close to -1, you have a
strong negative relationship.
» If the correlation coefficient is close to 0, you have little or no
linear correlation.
» WE HAVE THE ABILITY TO TEST THE HYPOTHESIS
H0: ρ = 0
» Warning:
˃ No proof of causality
˃ Cannot assume x causes y
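
To make the definition concrete, the sketch below computes Pearson's r from its textbook formula, together with the usual t statistic for testing H0: ρ = 0, t = r·sqrt((n − 2) / (1 − r²)). The weight and height figures are invented for illustration, and the comparison of t with a critical value on n − 2 degrees of freedom is left out.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y over their own variation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0 (undefined when r is exactly +1 or -1)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical student data (illustrative only): weight in lb, height in ft.
weights = [120, 140, 160, 180, 200, 220]
heights = [5.2, 5.5, 5.4, 5.9, 6.0, 6.3]

r = pearson_r(weights, heights)
t = t_statistic(r, len(weights))
print(round(r, 3), round(t, 2))
```

A large r here says only that height and weight move together in this sample; as the warning above notes, it says nothing about one causing the other.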
» Sample statistics estimate Population parameters
• M tries to estimate μ
• r tries to estimate ρ (“rho”, the Greek letter, not “p”)
» r: correlation for a sample
+ based on the limited observations we have
» ρ: actual correlation in the population
+ the true correlation
» Beware Sampling Error!!
˃ even if ρ=0 (there’s no actual correlation), you might
get r =.08 or r = -.26 just by chance.

Samples vs. Populations
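
Sampling error of this kind is easy to see in a small simulation. The sketch below repeatedly draws two unrelated variables (so the true ρ is 0) and records the sample r each time. The sample size of 30, the 1000 replications, the normal distribution and the seed are all arbitrary choices made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2024)
sample_size, replications = 30, 1000

sample_rs = []
for _ in range(replications):
    # x and y are generated independently, so the population correlation is 0.
    xs = [rng.gauss(0, 1) for _ in range(sample_size)]
    ys = [rng.gauss(0, 1) for _ in range(sample_size)]
    sample_rs.append(pearson_r(xs, ys))

print(round(min(sample_rs), 2), round(max(sample_rs), 2))
print(round(sum(sample_rs) / replications, 3))   # averages out near 0
```

Individual samples stray well away from 0 even though the average stays close to it, which is why a single observed r needs a significance test before it is interpreted.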


Let’s check whether there is any correlation between the CGPA obtained in your
latest degree and the CGPAs obtained earlier at Bachelor, FA/F.Sc. and Matric
levels. Similarly, check the correlation between the latest degree’s CGPA and
father’s and mother’s years of schooling, and the number of hours studied
per day.

1. First, click “ANALYZE”, then “CORRELATE”, then “BIVARIATE”.

2. Move the two variables of interest (CGPAL & CGPAB) from the left-hand list to the
right-side “VARIABLE” box, and click OK.

3. Note that:
1. r(CGPAL, CGPAB) = 0.473 = r(CGPAB, CGPAL), i.e. the correlation is symmetric.
2. CGPAL and CGPAB are correlated with estimated r = 0.473, which is significant at the
0.01 level.
SPSS Exercise 7(a):
Correlation and its estimation
