ES_Chapter 1-1
Chapter 1 Introduction to Survey and Statistics
❑ Sampling Methods
❑ Numerical variable vs categorical variable
Sample (some employees in this company)
Population (every employee in this company)
Limitations of Conducting a Census
❑ Time
❑ Manpower
❑ Budget
❑ Location
❑ Possibility of obtaining data
When conducting a census, data would be collected from every employee (3000
responses in total). Once the data collection is complete, we try to
understand the current situation through data analysis (for example, by
calculating the mean and standard deviation of the scores). You can imagine
that a mean satisfaction score of, say, 9.5 represents a very different
situation from a mean of 2.4.
When conducting a sample survey, we would also analyse the data in order to
draw conclusions about the current situation. However, as the data collection
is incomplete, we need to be very careful when drawing those conclusions. The
reliability of a conclusion drawn from a sample survey depends very much on
how well the sample represents the population.
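The contrast between a census and a sample survey can be sketched in a few lines of Python. This is only an illustration: the scores are randomly generated stand-ins, and the 0 to 10 score scale and the sample size of 500 are assumptions, not data from the company.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical satisfaction scores (assumed 0-10 scale) for all N = 3000 employees.
population_scores = [random.randint(0, 10) for _ in range(3000)]

# Census: every employee's score is used.
census_mean = statistics.mean(population_scores)
census_sd = statistics.pstdev(population_scores)  # population standard deviation

# Sample survey: only n = 500 randomly chosen scores are observed.
sample_scores = random.sample(population_scores, 500)
sample_mean = statistics.mean(sample_scores)
sample_sd = statistics.stdev(sample_scores)  # sample standard deviation (divides by n - 1)

print(f"census: mean = {census_mean:.2f}, sd = {census_sd:.2f}")
print(f"sample: mean = {sample_mean:.2f}, sd = {sample_sd:.2f}")
```

The sample figures will generally be close to, but not equal to, the census figures, which is why conclusions from a sample need more care.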
Key Concepts in Statistics
• All the elements of interest in a survey are known as the population.
• A subset of the elements, from which data are actually collected, is known as the sample.
• A survey which collects data from the whole population is called a census.
• A survey which collects data from part of the population is called a sample survey.
Types of Survey Sampling Methods
1. Non-probability Sampling
2. Probability Sampling
Sampling Methods
Sampling
  Non-probability Sampling (chance of selection unknown)
  Probability Sampling (chance of selection known)
    Simple Random Sampling
    Systematic Sampling
    Stratified Sampling
Probability Sampling
• When selecting a probability sample, we need to ensure that every element has a
known, non-zero chance of being selected.
Method 1: Simple Random Sampling
▪ Selects objects such that every object in the population has an equal chance
of being selected.
▪ Identify each element in the sampling frame with, say, a unique identity
number; the sample can then be selected using a random number table, a
computer, etc.
[Figure: a population of 10 elements, numbered 1 to 10]
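As a minimal sketch of selecting a simple random sample by computer, the snippet below numbers a population of ten elements 1 to 10 and draws from it with Python's standard `random` module. The helper name `simple_random_sample`, the sample size of 3 and the seed are assumptions made for illustration.

```python
import random

# Sampling frame: ten elements, each with a unique identity number.
sampling_frame = list(range(1, 11))

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every element is equally likely to be chosen."""
    rng = random.Random(seed)      # a seed is optional, used here for repeatability
    return rng.sample(frame, n)    # sampling without replacement

sample = simple_random_sample(sampling_frame, n=3, seed=42)
print(sample)
```

`random.sample` draws without replacement, so no element can appear twice in the sample.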
Example 1:
Simple Random Sampling
Advantage: ____________________
Disadvantage: ______________________________
Method 2: Systematic Sampling
Example 1:
Select a sample of size 500 from 3000 employees by
systematic sampling method.
Steps:
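One common way to carry out systematic sampling, sketched here under stated assumptions, is to compute the sampling interval k = N / n = 3000 / 500 = 6, pick a random starting point among the first k employees, and then take every k-th employee after that. The employee IDs 1 to 3000, the helper name `systematic_sample` and the seed are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first interval, then take every k-th element."""
    k = len(frame) // n           # sampling interval, e.g. 3000 // 500 = 6
    rng = random.Random(seed)
    start = rng.randrange(k)      # random 0-based position among the first k elements
    return frame[start::k][:n]    # every k-th element, capped at n elements

employee_ids = list(range(1, 3001))   # hypothetical IDs 1..3000
sample = systematic_sample(employee_ids, n=500, seed=7)
print(len(sample), sample[:5])
```

Only the starting point is random; once it is fixed, the rest of the sample is determined by the interval.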
Systematic Sampling
Advantage: ____________________
Disadvantage: ______________________________
_____________________________
Method 3: Stratified Sampling
Stratified Sampling
Example 1:
Select a stratified sample of size 500 from 3000 employees, of
whom 600 are managers and the other 2400 are junior staff.
Steps:
2. Draw 100 managers and 400 junior staff at random from their respective strata to
form a sample of 500 employees.
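The figures of 100 managers and 400 junior staff are consistent with proportional allocation, where each stratum receives a share of the sample equal to its share of the population (600/3000 × 500 = 100 and 2400/3000 × 500 = 400). The sketch below assumes that allocation rule; the helper name `proportional_allocation`, the employee labels and the seed are hypothetical.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Share the sample size n among strata in proportion to their sizes."""
    total = sum(stratum_sizes.values())
    # Integer division is exact for this example; in general rounding needs care.
    return {name: n * size // total for name, size in stratum_sizes.items()}

# Hypothetical sampling frames for the two strata.
strata = {
    "manager": [f"M{i}" for i in range(1, 601)],    # 600 managers
    "junior": [f"J{i}" for i in range(1, 2401)],    # 2400 junior staff
}
sizes = {name: len(members) for name, members in strata.items()}
allocation = proportional_allocation(sizes, n=500)

# Draw a simple random sample separately within each stratum.
rng = random.Random(0)
sample = []
for name, members in strata.items():
    sample.extend(rng.sample(members, allocation[name]))

print(allocation, len(sample))
```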
Stratified Sampling
Advantage: __________________________
Disadvantage: ___________________
Non-probability Samples
Example 2:
How should a sample of 500 teenagers be selected in
order to review their satisfaction level towards a brand of
cola?
Solution:
As the population (all teenagers in Hong Kong) is very
large, it is practically impossible to prepare a sampling
frame. A more practical way is to invite 500 teenagers to
join the survey by convenience sampling.
Variable
A variable is a characteristic of the individual to be
measured or observed.
e.g. age, weight, gender, nationality, income
Variables
Types of Data
Numerical variable: Data consists of numbers that represent counts
(usually integers) or measurements
Categorical variable: Data consists of names that represent categories
Nominal: No natural order between categories
Ordinal: There exists a natural order between categories
Question
We usually use a capital letter, e.g. X, to denote the
variable and a small letter, x, to denote the collected
data. Suppose X represents the gender of an
employee; then x1 = "F", x2 = "M", x3 = "F", x4 = "F", x5 = "M".
The sample size is usually denoted by n (here n = 5) and
the population size by N (here N = 3000).
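The notation can be mirrored in a short Python sketch: the list `x` holds the five observed values of X, `n` is obtained by counting them, and `N` is the population size quoted in the text. Summarising a categorical variable by counting categories, done here with `collections.Counter`, is an illustrative choice rather than something the slides prescribe.

```python
from collections import Counter

# Observed values x1..x5 of the variable X (gender of an employee).
x = ["F", "M", "F", "F", "M"]

n = len(x)    # sample size, n = 5
N = 3000      # population size, as given in the text

# A categorical variable is summarised by counting each category.
counts = Counter(x)
print(n, counts["F"], counts["M"])   # prints: 5 3 2
```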