
Chapter 2 Statistical Concepts in Research

This document discusses key concepts in statistics and research including experimental error, types of variables, data structure and distribution, and criteria for choosing appropriate statistical tests. It covers systematic and random errors, qualitative and quantitative variables, parametric and non-parametric tests, and how to determine normality. The document stresses that the appropriate test depends on the type of data, distribution, research design, objectives, and hypotheses. Sample size and sampling techniques also influence statistical test selection.

Uploaded by

cathrineramos01
Copyright
© © All Rights Reserved

Introduction to Concepts of Statistics in Research cum
Basic Criteria in Deciding the Appropriate Statistical Test
In an EXPERIMENT, there are …
EXPERIMENTAL ERROR: Variation produced by unknown factors that are beyond the control of, or not considered by, the experimenter (e.g. extraneous variables, confounding variables)

Systematic error (results are LESS ACCURATE): uncalibrated measuring tool; observer’s error due to fatigue; wrong implementation of protocols; inappropriate design; inappropriate statistical analysis, etc.

Random error (results are LESS PRECISE): unpredictable incidence of pests and diseases, meteorological events, etc., which are not included as factors
CRITERIA FOR APPROPRIATE DATA ANALYSIS

DATA STRUCTURE
• Type or Scale of Data/ Variable/ Measurement
• Data Distribution
Types of Variable/Data
According to Relationship

INDEPENDENT VARIABLE
• The CAUSE
• Explicitly manipulated in the experiment, or is the causal variable
• Synonymous with predictor variable in regression
a. Potato Varieties
b. Different Speed Levels (rotations per min)
c. Product Formulations
d. Risk Factors of Obesity
e. Features of ecological niches

DEPENDENT VARIABLE
• The EFFECT
• The consequence of the cause
• Synonymous with response variable in regression
a. Growth and Yield Characters
b. Efficiency and Performance Measures
c. Sensory Evaluation and
d. Risk Factors of Obesity
e. Features of ecological niches
Types of Variable/Data
According to Analysis

QUALITATIVE VARIABLE: classification, categories, labels
QUANTITATIVE VARIABLE: a true measurement

CATEGORICAL
Still numerical but not "true" measurements; rather, categories or labels represented only by numbers.
Example: sex/gender, class year, treatments, level of satisfaction, etc.

DISCRETE
A finite number of values that can occur between points concerning the variable, usually associated with count values/ data [whether true measurements or categorical].
Example: no. of insects, no. of tubers, etc. (if a true measurement); ratings for level of satisfaction, etc. (if categorical)

CONTINUOUS
A theoretically infinite range of values, including integers (whole numbers) and decimal values, usually a measurement value.
Example: height, weight, temperature, etc.
Level/ Scale of Measurements

QUANTITATIVE VARIABLE (CONTINUOUS or DISCRETE)
• RATIO: true measurement that has a "true" zero value, e.g. height, weight
• INTERVAL: true measurement but has NO "true zero" value, e.g. temperature

QUALITATIVE VARIABLE (CATEGORICAL or DISCRETE)
• ORDINAL: categories that have a particular order, e.g. level of satisfaction
• NOMINAL: categories that have NO particular order, e.g. sex/gender
Criteria in Deciding
the Statistical Test

MAIN TYPES
• Parametric tests: have the assumption of normal distribution
• Non-Parametric tests

NOTE: Since most population distributions approximately resemble the normal distribution, most of the statistical tests in common use are parametric tests.
Criteria in Deciding
the Statistical Test

MAIN TYPES
• Parametric tests
• Non-Parametric tests

Interval and ratio data are continuous in nature. Hence, they often resemble the normal distribution.
Criteria in Deciding
the Statistical Test

MAIN TYPES
• Parametric tests
• Non-Parametric tests

Ordinal data, although categorical in nature, are sometimes analyzed with parametric tests, provided of course that the normal distribution assumption has been tested and met.
Criteria in Deciding
the Statistical Test

MAIN TYPES
• Parametric tests
• Non-Parametric tests

In experimental research where extraneous or nuisance variables (e.g. soil gradients) are controlled, the normality of the distribution is usually not tested, except for ordinal variables.

In survey and observational research, however, normality must be tested first, regardless of whether the data are interval or ratio, before proceeding to parametric tests.
Criteria in Deciding
the Statistical Test

MAIN TYPES
• Parametric tests
• Non-Parametric tests

Non-parametric tests are tests for data that do not follow the normal distribution curve, which is common for categorical data, especially the nominal type.

Interval, ratio, or ordinal data can also be subjected to non-parametric tests, especially if the assumption of normality is not met.
Criteria in Deciding
the Statistical Test
HOW IS NORMALITY OF THE DISTRIBUTION TESTED?

GRAPHICALLY
• Box plot
• Normal Q-Q Plot
• Scatter Plot
• Histogram

DESCRIPTIVELY
• Skewness
• Kurtosis

INFERENTIALLY
• Shapiro-Wilk test
• Kolmogorov-Smirnov
• Chi Square Goodness-of-Fit
• Bartlett Test (strictly a test of equal variances, a separate assumption of ANOVA)

TAKE NOTE: the normality assumption of a parametric test applies either to the observed data or to the residuals, e.g. linear regression and ANOVA.
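As a rough illustration of the DESCRIPTIVE check, the sketch below computes skewness and excess kurtosis from the sample moments in plain Python. The function name and the two small data sets are made up for illustration. Values near 0 for both measures are consistent with normality but do not prove it.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) computed from central moments."""
    n = len(values)
    mean = sum(values) / n
    # Second, third and fourth central moments (divided by n, not n - 1).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skewness, excess_kurtosis


symmetric = [2, 4, 4, 5, 5, 5, 6, 6, 8]       # made-up, symmetric around 5
right_skewed = [1, 1, 1, 2, 2, 3, 4, 8, 20]   # made-up, long right tail

print(skewness_and_excess_kurtosis(symmetric))
print(skewness_and_excess_kurtosis(right_skewed))
```

The symmetric set gives a skewness of zero, while the long right tail of the second set gives a clearly positive skewness.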
Criteria in Deciding
the Statistical Test

AS MENTIONED….. the appropriate test depends on the…..

Type of Data
• Ratio
• Interval
• Ordinal
• Nominal

Distribution
• Normal Distribution
• Non Normal Distribution

The structure of your data helps you determine which parametric or non-parametric tests must be used.
Criteria in Deciding
the Statistical Test

MOREOVER…..
One must examine and be familiar with the quantitative research designs or methodologies, as these will guide you to the appropriate statistical test.

Along with the research designs and methods are, of course, the research objectives and hypothesis statements.
Criteria in Deciding
the Statistical Test

REVIEWING…..
RESEARCH DESIGN and METHODS

Data Collection:
Longitudinal (repeated measure) vs. Cross-sectional

Purpose:
Descriptive vs. Exploratory vs. Explanatory
Criteria in Deciding
the Statistical Test

REVIEWING…..
OBJECTIVES

Data Analysis:
Descriptive vs. Correlational vs. Comparative

Purpose:
Descriptive vs. Exploratory vs. Explanatory
Criteria in Deciding
the Statistical Test

REVIEWING…..
HYPOTHESIS

Significant Differences among
• one-sample group
• two-sample groups
• k-sample groups
is it…..
• independent (cross-sectional)
• dependent/ paired/ matched/ repeated (longitudinal)

Significant Association/ Relationship
Criteria in Deciding
the Statistical Test

REQUIRED SAMPLE SIZE
Some statistical tests require a minimum sample size to be ROBUST.

Example:
As a rule of thumb, the Z-test is used only when the number of observations in the sample is 30 or above.
Thanks!!!!
Sampling Distribution,
Sample Size Estimation, and
Sampling Techniques
Systematic error (results are LESS ACCURATE): uncalibrated measuring tool; observer’s error due to fatigue; wrong implementation of protocols; inappropriate design; inappropriate statistical analysis, etc.

SAMPLING ERROR: statistical error that arises from an INADEQUATE or BIASED SELECTION of the sample that is meant to represent the entire population of data

Random error (results are LESS PRECISE): unpredictable incidence of pests and diseases, meteorological events, etc., which are not included as factors
SAMPLING
DISTRIBUTION

In most cases, observing the entire POPULATION is not practical, although it would be the best.
SAMPLING
DISTRIBUTION

To understand the population, we tend to observe only “a part” of the population, which is formally called the SAMPLE.
SAMPLING
DISTRIBUTION

Samples are the representatives of the Population.

We draw inferences from samples, which aim to generalize to the population the samples were drawn from.
SAMPLING
DISTRIBUTION

The GENERALIZATION process from the sample to the population involves…..
A. Estimation (Descriptive Statistics)
B. Hypothesis-testing (Inferential Statistics)
SAMPLING
DISTRIBUTION

The POPULATION has features called PARAMETERS, which characterize the DISTRIBUTION of the variables in a population.

Poisson Distribution: COUNT variables, e.g. RBC counts
Binomial Distribution: PROPORTION variables, e.g. mortality rate
Both are approximated by the NORMAL DISTRIBUTION.
SAMPLING
DISTRIBUTION
NORMAL DISTRIBUTION (Gaussian)
ALMOST all populations are observed to approximately follow the bell-shaped curve (given that the variables are continuous).
SAMPLING
DISTRIBUTION
POPULATION PARAMETERS        SAMPLE STATISTICS (Estimates)
µ (Arithmetic mean)          x̄ (Arithmetic mean)
σ (Standard Deviation)       s (Standard Deviation)

NOTE: The parameters of a population with normal distribution are estimated from the sample.

THUS, the precision/closeness of the sample statistics to the population parameters must be ensured.
SAMPLING
DISTRIBUTION

It is unlikely that a sample statistic will be EQUAL to the population parameter. There will always be a difference.

It is called [SAMPLING] ERROR.

It is measured by the Standard Error of the Estimate:
• Sample Mean
• Sample Proportion
SAMPLING
DISTRIBUTION
Standard Error of the Mean (SEM)
Measures the dispersion of the sample means.

We often rely only on a single sample and not on all possible samples in a given population. Thus, the SEM can be estimated with a single sample.
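The usual single-sample estimate is SEM = s / √n, where s is the sample standard deviation and n is the sample size. A minimal sketch in Python follows; the function name and the data are hypothetical and serve only as an illustration.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Estimate the SEM from one sample: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(len(sample))


yields = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations, n = 8
print(standard_error_of_mean(yields))
```

A larger sample shrinks the SEM, which is why sample size and precision of the estimate are tied together.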
SAMPLING
DISTRIBUTION
Confidence Interval of the MEAN
An interval around an estimate (i.e. the mean) can be used to judge the precision of the estimate of the population mean.

Known Population Variance
where:
Z = 1.96 for Z0.05 or 95% confidence level
Z = 2.58 for Z0.01 or 99% confidence level

Unknown Population Variance
where:
α = 0.05 (95% confidence level) or 0.01 (99%)
df = n - 1

The standard error term in the interval is the SEM.
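In the standard textbook form (assumed here, because the slide formulas did not survive extraction), the interval is x̄ ± Z × σ/√n when the population variance is known, and x̄ ± t(α, df) × s/√n when it is unknown. The sketch below takes the critical value as an argument, so the same hypothetical function covers both cases; the t value must be looked up from a t-table for df = n - 1.

```python
import math
import statistics


def mean_confidence_interval(sample, critical_value=1.96, population_sd=None):
    """Return (lower, upper) limits of the confidence interval of the mean.

    critical_value: 1.96 or 2.58 for the Z case, or a t-table value
                    (df = n - 1) when the population variance is unknown.
    population_sd:  known population SD; if None, the sample SD is used.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    sd = population_sd if population_sd is not None else statistics.stdev(sample)
    margin = critical_value * sd / math.sqrt(n)  # critical value times the SEM
    return mean - margin, mean + margin


weights = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, n = 8
print(mean_confidence_interval(weights, 1.96, population_sd=2.0))  # known variance
print(mean_confidence_interval(weights, 2.365))  # unknown variance, t for df = 7
```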
SAMPLING
DISTRIBUTION
Standard Error of the Proportion
Measures the dispersion of the sample proportions.

We often rely only on a single sample and not on all possible samples in a given population. Thus, SE(p) can be estimated with a single sample.

where:
p = sample proportion
n = sample size
SAMPLING
DISTRIBUTION
Confidence Interval of the PROPORTION
An interval around an estimate (i.e. the proportion) can be used to judge the precision of the estimate of the population proportion.

where:
Z = 1.96 for Z0.05 or 95% confidence level
Z = 2.58 for Z0.01 or 99% confidence level
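In the standard textbook form (assumed here, as the slide formulas are not in the extracted text), SE(p) = √(p(1 - p)/n) and the interval is p ± Z × SE(p). A short Python sketch with hypothetical function names and made-up inputs:

```python
import math


def standard_error_of_proportion(p, n):
    """SE(p) = sqrt(p * (1 - p) / n) for sample proportion p and sample size n."""
    return math.sqrt(p * (1 - p) / n)


def proportion_confidence_interval(p, n, z=1.96):
    """Return (lower, upper) limits: p +/- z * SE(p)."""
    margin = z * standard_error_of_proportion(p, n)
    return p - margin, p + margin


# Made-up example: 50 of 100 sampled animals show the attribute.
print(standard_error_of_proportion(0.5, 100))
print(proportion_confidence_interval(0.5, 100))        # 95% confidence level
print(proportion_confidence_interval(0.5, 100, 2.58))  # 99% confidence level
```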
MEAN

PROPORTION
A SAMPLING PLAN in the conceptualization of your research is indeed very important…
1. Estimation of the required sample size
2. Technique in sampling from the population

WHY?
As mentioned, the SAMPLING PLAN affects the precision of estimates and the statistical significance with respect to the population you are trying to explain/ describe.
SAMPLE SIZE
ESTIMATION
Sample size: the number of participants in a sample drawn from a target population.

How can we be confident or assured that the representatives (sample) are sufficient/ enough/ adequate to detect statistical significance?

Take note that the estimates and tested significance from the sample are generalized to the population it was drawn from.
SAMPLE SIZE
ESTIMATION

Sullivan (n.d.)
Boston University

These are formulae for estimating the sample size in any experimental research, common in biostatistics, with the statistical analysis in mind.

They integrate the confidence interval (the precision measure of the sample estimates relative to the population).
SAMPLE SIZE
ESTIMATION
In SURVEY research…..

Cochran (1963)
For a large population (i.e. more than 1000), or when the actual number of the target population is not known (Infinite Population)

Where:
n0 = sample size
Z = value for the desired confidence level (squared in the formula)
    1.96 for Z0.05 or 95% confidence level
    2.58 for Z0.01 or 99% confidence level
e = desired precision (generally 10% of p, or the value 0.05)
p = proportion of the attribute (e.g. incidence rate of a disease)
q = 1 – p
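Cochran's formula for an infinite population is commonly written as n0 = Z² × p × q / e². The sketch below applies it in Python; the function name is hypothetical, and rounding up to the next whole respondent is an assumption of this sketch, not something stated on the slide.

```python
import math


def cochran_sample_size(p, e=0.05, z=1.96):
    """Sample size for an infinite population: n0 = z^2 * p * q / e^2."""
    q = 1 - p
    n0 = (z ** 2) * p * q / (e ** 2)
    return math.ceil(n0)  # round up so the precision target is still met


# p = 0.5 is the most conservative choice when the proportion is unknown.
print(cochran_sample_size(p=0.5, e=0.05, z=1.96))
print(cochran_sample_size(p=0.5, e=0.05, z=2.58))
```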
SAMPLE SIZE
ESTIMATION
In SURVEY research…..
If the number of the target population is known (Finite Population)

Adjusted Cochran (1963)

Where:
n0 = sample size from the infinite population
N = known number of the target population
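The finite population correction is commonly written as n = n0 / (1 + (n0 - 1) / N). A minimal Python sketch, with a hypothetical function name and a made-up population size; rounding up is again an assumption of this sketch.

```python
import math


def adjusted_cochran_sample_size(n0, population_size):
    """Finite population correction: n = n0 / (1 + (n0 - 1) / N)."""
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)


n0 = 384.16  # infinite-population size for p = 0.5, e = 0.05, Z = 1.96
print(adjusted_cochran_sample_size(n0, population_size=2000))
```

The adjusted size is always smaller than n0 and approaches n0 as N grows.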
SAMPLE SIZE
ESTIMATION
In SURVEY research…..

Yamane’s Formula (1967)
This formula resembles Slovin’s Formula. However, the first publication of the formula was authored by Yamane, hence the use of Yamane’s instead of Slovin’s.

Where:
N = known number of the target population
e = margin of error (max. of 0.05/ 5%)
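Yamane's formula is commonly written as n = N / (1 + N × e²). The Python sketch below uses a hypothetical function name and a made-up population size, and rounds up to a whole respondent as an assumption.

```python
import math


def yamane_sample_size(population_size, e=0.05):
    """Yamane's formula: n = N / (1 + N * e^2)."""
    n = population_size / (1 + population_size * e ** 2)
    return math.ceil(n)


print(yamane_sample_size(2000, e=0.05))
print(yamane_sample_size(500, e=0.05))
```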
SAMPLING TECHNIQUES

AGAIN, part of the sampling plan is the technique by which the samples were collected/ selected.

THIS is to ensure that the sample selected/ collected CLOSELY represents the distribution of the target population, as much as possible.
SAMPLING
TECHNIQUES

• PROBABILITY Sampling Techniques
• NON-PROBABILITY Sampling Techniques

PROBABILITY Sampling Techniques are techniques in which every observation unit/ respondent has an EQUAL CHANCE of being selected.
Process: RANDOM SELECTION
SAMPLING
TECHNIQUES

• PROBABILITY Sampling Techniques
• NON-PROBABILITY Sampling Techniques

NON-PROBABILITY Sampling Techniques are often associated with case study research design and qualitative research, which tend to focus on small samples and are intended to examine a real-life phenomenon, not to make statistical inferences in relation to the wider population.

SOMETIMES……..
Quantitative research that utilizes a non-probability sampling technique is questioned as to its use and the validity of its statistical inferential analysis (hypothesis-testing).
SAMPLING
TECHNIQUES
PROBABILITY Sampling Techniques

This requires a SAMPLING FRAME, for example:
• List of names of the target respondents/ participants
• All crops planted in the experimental area
• The whole organ, in particular
SAMPLING
TECHNIQUES
PROBABILITY

SIMPLE RANDOM SAMPLING
Fishbowl Technique/ Lottery method/ Draw Lots

SYSTEMATIC RANDOM SAMPLING
1. Compute K = N/n,
2. Randomly select one from the first K,
3. Starting from the selected one, keep adding K until the last of the sampling frame
NOTE: The sampling frame must be arranged in order, either descending or ascending.

[Figure: Simple Random Sampling; Systematic Random Sampling]
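The three steps of systematic random sampling can be sketched in Python as follows. The function name and the numbered frame are hypothetical. As a design choice of this sketch, K is taken as the integer part of N/n and exactly n units are returned, which may leave a few units at the end of the frame unvisited when N is not a multiple of n.

```python
import random


def systematic_sample(frame, n, rng=None):
    """Select n units from an ordered sampling frame at a fixed interval K."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the size of the frame")
    if rng is None:
        rng = random.Random()
    k = len(frame) // n          # step 1: sampling interval K = N / n
    start = rng.randrange(k)     # step 2: random start within the first K units
    # step 3: keep adding K to the starting position
    return [frame[start + i * k] for i in range(n)]


frame = list(range(1, 101))      # made-up ordered frame of 100 numbered units
print(systematic_sample(frame, 10, random.Random(0)))
```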
Probability Sampling
Simple Random Sampling

Particular sampling units (for example, locations and/or times) are selected using random numbers, and all possible selections of a given number of units are equally likely. It suits populations that are homogeneous (no major patterns or hot spots are expected).
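Selection with random numbers can be sketched with Python's standard `random` module, whose `sample` method draws without replacement so that every subset of the requested size is equally likely. The function name, the seed argument, and the numbered plots are assumptions for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, all subsets equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)


plots = [f"plot-{i}" for i in range(1, 21)]  # made-up frame of 20 plots
print(simple_random_sample(plots, 5, seed=42))
```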
Probability Sampling
Systematic or Grid Sampling

Samples are taken at regularly spaced intervals over space or time.

An initial location or time is chosen at random, and then the remaining sampling locations are defined so that all locations are at regular intervals over an area (grid) or time (systematic). Grids include square, rectangular, triangular, or radial grids.
SAMPLING
TECHNIQUES
PROBABILITY

STRATIFIED RANDOM SAMPLING
Forming relevant subgroups called STRATA
e.g. based on demographic profile (sex, age, etc.)

CLUSTER SAMPLING
Forming relevant subgroups called CLUSTERS
e.g. based on geographical context (e.g. municipalities)

[Figure: Stratified Sampling; Cluster Sampling]
Probability Sampling
Stratified Random Sampling

The target population is separated into non-overlapping strata, or subpopulations, that are known or thought to be more homogeneous (relative to the environmental medium or the contaminant).

Strata may be chosen on the basis of spatial or temporal proximity of the units, or on the basis of preexisting information or professional judgment about the site or process.
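One possible way to sketch stratified random sampling in Python is shown below. It assumes proportional allocation, meaning the same sampling fraction is drawn from every stratum, which the slides do not specify; the function name, the `stratum_of` callback, and the respondent data are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from each non-overlapping stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # separate units into strata
    sample = {}
    for name, members in strata.items():
        size = max(1, round(fraction * len(members)))  # at least one per stratum
        sample[name] = rng.sample(members, size)       # simple random draw within
    return sample


# Made-up respondents tagged by sex: 60 female, 40 male.
respondents = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]
picked = stratified_sample(respondents, lambda unit: unit[0], fraction=0.1, seed=7)
print({name: len(members) for name, members in picked.items()})
```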
Probability Sampling
(Adaptive) Cluster Sampling

In adaptive cluster sampling, n samples are taken using simple random sampling, and additional samples are taken at locations where measurements exceed some threshold value.

Initial measurements are made of randomly selected primary sampling units using simple random sampling. Whenever a sampling unit is found to show a characteristic of interest (for example, contaminant concentration of concern, ecological effect as indicated by the shaded areas in the figure), additional sampling units adjacent to the original unit are selected, and measurements are made.
SAMPLING
TECHNIQUES
PROBABILITY
MULTI-STAGE SAMPLING

This can be done in repeated stages using STRATIFIED or CLUSTER SAMPLING.

TARGET POPULATION: Hog Raisers in the Cordillera Region
1. Randomly select Provinces
2. Randomly select Municipalities of the selected Provinces
3. All Hog Raisers in the selected municipalities

[Figure: Multi-Stage (Cluster)]
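The staged selection can be sketched in Python with a nested dictionary standing in for the sampling frame. The function name, the placeholder province and municipality labels, and the raiser identifiers are all hypothetical; a real frame would list the actual provinces, municipalities, and hog raisers.

```python
import random


def multistage_sample(population, n_provinces, n_municipalities, seed=None):
    """Stage 1: provinces; stage 2: municipalities; stage 3: all raisers in them."""
    rng = random.Random(seed)
    selected = []
    for province in rng.sample(sorted(population), n_provinces):
        municipalities = population[province]
        for municipality in rng.sample(sorted(municipalities), n_municipalities):
            selected.extend(municipalities[municipality])  # take every raiser
    return selected


# Placeholder frame: 3 provinces x 3 municipalities x 2 hog raisers each.
population = {
    f"Province {p}": {
        f"Municipality {p}{m}": [f"raiser-{p}{m}-{i}" for i in range(2)]
        for m in "123"
    }
    for p in "ABC"
}
print(multistage_sample(population, n_provinces=2, n_municipalities=2, seed=1))
```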
NON-Probability Sampling
Composite Sampling

Volumes of material from several of the selected sampling units are physically combined and mixed in an effort to form a single homogeneous sample, which is then analyzed.

Compositing is often used in conjunction with other sampling designs when the goal is to estimate the population mean and when information on spatial or temporal variability is not needed. It can also be used to estimate the prevalence of a rare trait.
SAMPLING
TECHNIQUES
NON-PROBABILITY
QUOTA SAMPLING

Like cluster and stratified. But…..
The proportion of each stratum or cluster in the target population must be known.

This is because the proportion of each stratum/ cluster in the sample must be equal to the proportion of that stratum/ cluster in the population.
SAMPLING
TECHNIQUES
NON-PROBABILITY
QUOTA SAMPLING

Example: If in Benguet there are....
40% Kankanaey
30% Ibaloy
20% Kalanguya
10% others

Then, make sure that out of the sample size, there are…..
40% Kankanaey
30% Ibaloy
20% Kalanguya
10% others
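The quota for each group is simply the sample size multiplied by the group's share in the population. A small Python sketch follows, using the percentages from the example; the function name and the sample size of 200 are assumptions for illustration. With other shares or sizes, rounding can make the quotas add up to slightly more or less than the sample size, so the total should be checked.

```python
def quota_allocation(sample_size, proportions):
    """Return the number of respondents to recruit from each group."""
    return {group: round(sample_size * share) for group, share in proportions.items()}


shares = {"Kankanaey": 0.40, "Ibaloy": 0.30, "Kalanguya": 0.20, "others": 0.10}
quotas = quota_allocation(200, shares)  # 200 is a made-up sample size
print(quotas)
print(sum(quotas.values()))
```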
SAMPLING
TECHNIQUES
NON-PROBABILITY
SNOWBALL SAMPLING

A.K.A. referral technique/ method/ scheme

This can be used for populations that are rare or difficult to access due to their closed nature (e.g. rape victims, HIV-positive individuals, etc.)
SAMPLING
TECHNIQUES
NON-PROBABILITY
PURPOSIVE/ JUDGEMENT SAMPLING

This requires inclusion-exclusion criteria: a specific description of who or what is to be included in the observation.

Whoever or whatever merits inclusion based on the research objectives shall be included.
SAMPLING
TECHNIQUES
NON-PROBABILITY
CONVENIENCE SAMPLING

This technique is often misunderstood as simple random sampling.

Many studies that utilized convenience sampling, by nature, reported that simple random sampling was used.

In a survey, for instance, it is simply “floating” the questionnaires to anyone, especially friends, family, colleagues, and others who offer the element of “convenience”.
Thanks!!!!
