
ES_Chapter 1-1

Chapter 1 introduces the basics of survey and statistics, covering topics such as the workflow of conducting a survey, differences between census and sample surveys, and various sampling methods. It emphasizes the importance of understanding population and sample characteristics, as well as the types of variables and data involved in statistical analysis. The chapter also outlines the advantages and disadvantages of different sampling techniques, including simple random, systematic, and stratified sampling.

Uploaded by

haidedental
Copyright
© All Rights Reserved

Elementary Statistics

Chapter 1

Chapter 1 Introduction to Survey and Statistics

❑ Workflow of conducting a survey

❑ Census and Sample survey

❑ Sampling Methods
❑ Numerical variable vs categorical variable

❑ Presentation of numerical dataset

❑ Summarizing the findings in a simple paragraph

❑ Linear function of a variable


Example 1:
Suppose the manager of a large company with 3000 employees
wants to collect information about the employees' level of satisfaction
with the company. How should the manager plan this survey?

Q: Why do we have to conduct this survey?


Q: Who is eligible to participate in the study?
Q: How do we measure the level of satisfaction?
Q: How many employees should be involved in the study?
Q: How should the collected data be summarized to present the result?

Q: Why do we have to conduct this survey?
The objective: to understand the employees' level of satisfaction
with the company.
Q: Who is eligible to participate in the study?
Subject: every employee in the company.
Q: How do we measure the level of satisfaction?
Variable: a score on a scale from 0 (very unsatisfied) to 10 (very satisfied).

Q: How many employees should be involved in the study?
Most ideal: conduct a census of all employees (3000 employees),
or, more conveniently, conduct a sample survey of a portion of the employees
(e.g. 500 employees).
Q: How should the collected data be summarized to present the result?
Summary statistics, e.g. mean, standard deviation, 25th percentile,
75th percentile, ...
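
As a minimal sketch of such a summary, the Python snippet below computes the mean, standard deviation and quartiles of a small set of scores. The 12 scores are made up purely for illustration and are not survey results. Textbooks and software use slightly different conventions for percentiles, so the quartiles returned by `statistics.quantiles` may differ slightly from a hand calculation.

```python
import statistics

# Hypothetical satisfaction scores (0 to 10) from 12 employees; illustrative only.
scores = [7, 8, 5, 9, 6, 7, 10, 4, 8, 7, 6, 9]

mean_score = statistics.mean(scores)
# stdev() is the sample standard deviation (divides by n - 1).
sd_score = statistics.stdev(scores)
# quantiles(..., n=4) returns the three cut points: 25th, 50th and 75th percentiles.
q1, median, q3 = statistics.quantiles(scores, n=4)

print(f"mean = {mean_score:.2f}, standard deviation = {sd_score:.2f}")
print(f"25th percentile = {q1}, median = {median}, 75th percentile = {q3}")
```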
Difference between Census and Survey

Population (every employee in this company)
Sample (some employees in this company)
Limitations of Conducting a Census

❑ Time
❑ Manpower
❑ Budget
❑ Location
❑ Possibility of obtaining data

Instead, a sample is selected by a fair and random procedure, and the data
are then collected and analysed.
When doing a census, data are collected from every employee (3000
responses in total). Once data collection is complete, we try to
understand the current situation through data analysis (for example,
calculating the mean and standard deviation of the scores). A mean
satisfaction score of 9.5, for example, represents a very different
situation from a mean of 2.4.

When doing a sample survey, we also analyse the data in order to draw
conclusions about the current situation. However, as the data collection is
incomplete, we need to be very careful when drawing conclusions. The
reliability of a conclusion drawn from a sample survey depends very much on
how well the sample represents the population.

Key Concepts in Statistics

A Population is the totality of elements (also called items or objects) under
consideration. An investigation based on data from the whole population is
called a census. Sometimes it is too expensive or impossible to obtain data
on every object in the population. In this case, we conduct a sample survey,
selecting some objects of the population for analysis in order to infer the
characteristics of the whole population. A Sample is a portion of the
population that is selected for analysis.


• All the elements under study in a survey are known as the population.
• A subset of those elements is known as the sample.
• A survey which collects data from the whole population is called a census.
• A survey which collects data from part of the population is called a sample survey.

Types of Survey Sampling Methods

Sampling methods are classified into two types:

1. Non-Probability Sampling

• The probability of a member being selected into the sample is unknown.
• Selection is made for convenience and to save time.

2. Probability Sampling

• Each member of the population has a known probability of being selected into the sample.
• Selection is based on chance.


Sampling Methods

Sampling
❑ Non-probability Sampling (selection probability unknown)
❑ Probability Sampling (selection probability known)
   ▪ Simple Random Sampling
   ▪ Systematic Sampling
   ▪ Stratified Sampling

Probability Sampling
• When selecting a probability sample, we need to ensure that every element has a chance
of being selected.

• A Sampling Frame is a data file that contains information on the objects in the
population.
Examples: a telephone directory, a student registration
list, employment records, etc.

Method 1: Simple Random Sampling
▪ Selects objects such that every object of the population has an equal chance
of being selected.

▪ Identify each element in the sampling frame with, say, a unique identity
number; the sample can then be selected using a Random Number Table,
a computer, etc.

[Figure: a population of 10 elements, numbered 1 to 10]

Example 1:

Select a sample of size 500 from 3000 employees by the
simple random sampling method, using a Random Number Table.
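
The same selection can be sketched on a computer instead of with a Random Number Table. The snippet below is only an illustration: it assumes the employees are numbered 1 to 3000 in the sampling frame, and the seed value is arbitrary (it is fixed only so that the draw can be reproduced).

```python
import random

N = 3000  # population size
n = 500   # sample size

# Assumed sampling frame: employee identity numbers 1, 2, ..., 3000.
sampling_frame = list(range(1, N + 1))

rng = random.Random(2024)  # arbitrary seed, used only for reproducibility
# sample() draws without replacement, so each employee appears at most once
# and every employee has the same chance of being selected.
srs_sample = rng.sample(sampling_frame, n)

print(sorted(srs_sample)[:10])  # first few selected identity numbers
```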

Simple Random Sampling

Advantage: ____________________

Disadvantage: ______________________________

Method 2: Systematic Sampling

Systematic Sampling selects the first object, a, at random and the
rest at a fixed interval k.

Example 1:
Select a sample of size 500 from 3000 employees by the
systematic sampling method.
Steps:

1. Assign a unique number to each employee

2. Find the ratio k

3. Randomly select a starting number a

4. Select a, a + k, a + 2k, ... and so on, until 500 numbers are chosen

5. The employees corresponding to these numbers are chosen as the
sample for the survey
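
The steps above can be sketched in Python as follows. The slides do not state how k is computed or from which range a is drawn, so the code assumes the common convention k = N / n (here 3000 / 500 = 6) with a chosen at random from 1 to k. The employee numbering 1 to 3000 and the seed are also assumptions.

```python
import random

N = 3000  # population size (employees assumed to be numbered 1..N)
n = 500   # sample size

# Assumed convention: interval k = N / n, which is exactly 6 here.
k = N // n

rng = random.Random(7)  # arbitrary seed, used only for reproducibility
a = rng.randint(1, k)   # random starting number between 1 and k inclusive

# Select a, a + k, a + 2k, ... until n numbers are chosen.
systematic_sample = [a + i * k for i in range(n)]

print(f"k = {k}, a = {a}")
print(systematic_sample[:5], "...", systematic_sample[-1])
```

With this convention the largest selected number is a + 499k, which is at most 3000, so the selection never runs past the end of the list.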

Systematic Sampling

Advantage: ____________________

Disadvantage: ______________________________
_____________________________

Method 3: Stratified Sampling

▪ Stratified sampling divides the whole population into
distinct subgroups (called strata)

▪ Elements inside each stratum share a common characteristic

▪ Individual samples are then selected randomly from each of the strata

▪ Strata are often sampled in proportion to their actual
percentages of occurrence in the overall population.

Stratified Sampling

Example 1:
Select a stratified sample of size 500 from 3000 employees, of
whom 600 are managers and the other 2400 are junior staff.

Steps:

1. Find the sample size for each subgroup

Sample size for managers:

Sample size for junior staff:

2. Draw 100 managers and 400 junior staff randomly from their respective strata to
form a sample of 500 employees.
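
The two steps can be sketched in Python under proportional allocation, where each stratum's sample size is the total sample size multiplied by that stratum's share of the population. This reproduces the 100 managers and 400 junior staff in step 2. The identity numbers (managers 1 to 600, junior staff 601 to 3000) and the seed are hypothetical, chosen only to make the example runnable.

```python
import random

n = 500  # total sample size
strata_sizes = {"managers": 600, "junior staff": 2400}
N = sum(strata_sizes.values())  # population size, 3000

# Step 1: proportional allocation, n * (stratum size / N).
# The division is exact for these figures, so integer division loses nothing.
allocation = {name: n * size // N for name, size in strata_sizes.items()}

# Hypothetical sampling frames: managers numbered 1..600, junior staff 601..3000.
frames = {
    "managers": list(range(1, 601)),
    "junior staff": list(range(601, 3001)),
}

# Step 2: draw a simple random sample within each stratum.
rng = random.Random(1)  # arbitrary seed, used only for reproducibility
stratified_sample = {
    name: rng.sample(frames[name], allocation[name]) for name in frames
}

print(allocation)
```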
Stratified Sampling

Advantage: __________________________

Disadvantage: ___________________

Non-probability Samples

Select the sample in a convenient way (e.g.
street interviews). Practical when no sampling
frame is available.

Example 2:
How should a sample of 500 teenagers be selected in
order to study their level of satisfaction with a brand of
cola?

Solution:
As the population (all teenagers in Hong Kong) is very
large, it is impossible to prepare a sampling
frame. A more practical way is to invite 500 teenagers to
join the survey by convenience.

Variable
A variable is a characteristic of the individual to be
measured or observed.
e.g. age, weight, gender, nationality, income
Variables

Types of Data

Numerical variable: Data consist of numbers that represent counts or
measurements

Discrete: Data can take only particular values (usually integers)

Continuous: Data can take any value within a range

Categorical variable: Data consist of names that represent
categories
Nominal: No natural order between categories
Ordinal: There is a natural order between categories

Question

State whether each of the following questions provides numerical
or categorical data, and indicate the type of data for each
question.

(a) What is your age?

(b) Which school are you studying at?

(c) What type of video games do you frequently play?

(d) How much do you spend on video games per month?


Example 3:
This is part of the survey results. How many variables
are there? What are the data types of the variables?

We usually use a capital letter, e.g. X, to denote the
variable and the corresponding small letter, x, to denote the collected
data. Suppose X represents the gender of an
employee; then x1 = "F", x2 = "M", x3 = "F", x4 = "F", x5 = "M".
The sample size is usually denoted by n (here n = 5) and the
population size by N (here N = 3000).
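
As a small illustration of this notation, the snippet below stores the five observations from the example in a Python list and counts each category. Python lists are indexed from 0, so `x[0]` corresponds to x1. The variable names are chosen here to mirror the notation and are not part of the original slides.

```python
from collections import Counter

# Observed values x1, ..., x5 of the variable X (gender of an employee).
x = ["F", "M", "F", "F", "M"]

n = len(x)  # sample size, 5
N = 3000    # population size

# For a categorical variable, a natural first summary is the count per category.
counts = Counter(x)

print(f"n = {n}, N = {N}")
print(dict(counts))
```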
