
Basic Concept of Inferential Biostatistics
Hypothesis

 A statistical hypothesis is an assumption about the distribution of random variables on the basis of observed samples.
 In other words, a hypothesis is a statement about the parameter of the population from which the sample may be drawn.
 A hypothesis may or may not be true.


Null Hypothesis and Alternative Hypothesis
Null Hypothesis (Ho):
 If the difference between the true value and the hypothesized value of the parameter is equal to zero, then the hypothesis is called the null hypothesis.
 Say, θ = θo, where θ is the true value of the parameter and θo is the hypothesized value of the parameter.
 The null hypothesis is generally denoted by Ho, and we write Ho: θ = θo.

Null Hypothesis and Alternative Hypothesis
Alternative Hypothesis (HA or H1):
 If the difference between the true value and the hypothesized value of the parameter is not equal to zero, then the hypothesis is called the alternative hypothesis.
 Any hypothesis which is accepted when the null hypothesis is rejected is called the alternative hypothesis. It is denoted by H1 or HA.
 In the hypothesis testing procedure, we always test the null hypothesis against the alternative hypothesis.

Sometimes the null hypothesis cannot be rejected, but this does not mean that it is true; it only means that we are unable to produce enough evidence to reject the stated null hypothesis. Thus, the investigator must have a clearly defined null hypothesis in hand.
Null and Alternative Hypothesis
Let the null hypothesis be
Ho: θ = θo.
Then the alternative hypothesis will be
H1: θ ≠ θo (two-tailed test)
OR
H1: θ > θo (right-tailed test)  }
H1: θ < θo (left-tailed test)   } one-tailed tests

One tail and two tail inferential statistical test
• The choice of a one-tail or two-tail test depends on the objective of the study.
• If the researcher's hypothesis is that a new drug is more effective in reducing blood pressure, then a one-tail test is sufficient to test the hypothesis;
• but if it is not known whether the new drug is more or less effective in lowering BP than the existing drug, then it is always better to use a two-tail test.
• The inputs for the one-tail and two-tail tests are the same except for the critical ratio (Z value), which differs between the one-tail test (Zα) and the two-tail test (Zα/2). A minimal sketch of this distinction follows below.
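Below is a minimal Python sketch of this distinction, using a one-sample z-test with hypothetical blood-pressure numbers (mu0, xbar, sigma, and n are assumed values, not from the slides); it shows how the two-tailed and one-tailed p-values differ for the same test statistic.

# Hypothetical example: one-sample z-test for a mean, comparing
# one-tailed and two-tailed p-values for the same data.
from math import sqrt
from scipy.stats import norm

mu0 = 140.0          # hypothesized mean systolic BP under Ho (assumed)
xbar = 136.5         # observed sample mean (assumed)
sigma = 12.0         # known population SD (assumed)
n = 50               # sample size (assumed)

z = (xbar - mu0) / (sigma / sqrt(n))      # test statistic

p_two_tailed = 2 * norm.sf(abs(z))        # Ho: mu = mu0 vs H1: mu != mu0
p_left_tailed = norm.cdf(z)               # Ho: mu = mu0 vs H1: mu < mu0

print(f"z = {z:.2f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
print(f"left-tailed p = {p_left_tailed:.4f}")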
One-tailed and two-tailed tests

Figure 1: Two-tailed test diagram
Figure 2: One-tailed (right- and left-tailed) test diagram


Types of Error
When the test statistic is applied to test the null hypothesis against the alternative hypothesis, we may commit two types of errors.
Type I error:
 The error committed in rejecting Ho when Ho is true is known as a Type I error.
 The probability of committing a Type I error is known as the size of the test or the size of the critical region, and is denoted by 'α'.
• Conventionally, a 5% (α = 0.05) or 1% (α = 0.01) level of significance is used by biomedical researchers, which means that the researcher accepts that there could be a 5% or 1% probability that the observed results are due to chance and not to the intervention.
• The corresponding confidence levels (confidence interval [CI]) for these levels of significance are:
• (a) a CI of 95% for the 5% (α = 0.05) level of significance, and
• (b) a CI of 99% for the 1% (α = 0.01) level of significance.
Types of Error

Type II error:
 The error committed in accepting Ho when Ho is false is known as a Type II error.
 The probability of committing a Type II error is known as the size of the Type II error, and is denoted by 'β'.
Study power
• Power is the probability that the test correctly rejects a false null hypothesis, i.e., the probability of detecting an effect in the population when it really exists.
• An increase in statistical power decreases the probability of a Type II error (β), i.e., it reduces the risk of false-negative results.
• Power is therefore denoted as 1 − β.
• In most clinical trials, a power of 0.8 (80%) or greater is considered appropriate to detect a statistically significant difference.
• A power of 80% means there is a 20% chance that we may fail to identify a significant difference even though it really exists (a minimal power-calculation sketch follows below).
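As an illustration, here is a minimal Python sketch of a power calculation for a two-sided one-sample z-test under the normal approximation; the effect size, SD, and sample size are hypothetical values, not from the slides.

# Minimal sketch: power of a two-sided one-sample z-test under the
# normal approximation. All numbers below are hypothetical.
from math import sqrt
from scipy.stats import norm

alpha = 0.05         # Type I error rate
effect = 5.0         # true difference from the null value (assumed)
sigma = 12.0         # population SD (assumed)
n = 60               # sample size (assumed)

z_crit = norm.ppf(1 - alpha / 2)          # critical value Z_alpha/2
shift = effect / (sigma / sqrt(n))        # standardized shift of the test statistic
power = norm.sf(z_crit - shift) + norm.cdf(-z_crit - shift)

print(f"power = {power:.2f}, beta = {1 - power:.2f}")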
Z value (from the Standard Normal Table)

Level of significance α (confidence %)     Two-tailed Z value (Zα/2)
0.05  (95)                                 1.96
0.01  (99)                                 2.58
0.001 (99.9)                               3.29

Power of the test (1 − β)                  Z value (Z1−β)
0.80                                       0.84
0.90                                       1.28
0.95                                       1.65
0.99                                       2.33

(A short script reproducing these Z values is given below.)
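As a check, this minimal Python sketch reproduces the table's Z values from the standard normal quantile function (norm.ppf) in scipy; the loop values mirror the rows of the table.

# Sketch: reproducing the table's Z values with the standard normal
# quantile function. Note that Z for 95% power is 1.645, shown in the
# table rounded to 1.65.
from scipy.stats import norm

for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}:  Z_alpha/2 = {norm.ppf(1 - alpha / 2):.2f}")

for power in (0.80, 0.90, 0.95, 0.99):
    print(f"power = {power}:  Z_(1-beta) = {norm.ppf(power):.2f}")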
Critical region and Acceptance region

 The region where the null hypothesis is rejected is known as the critical region. It is also known as the region of rejection.
 And the region where the null hypothesis is accepted is known as the acceptance region.
Critical Value (or Significant Value)

 The value of the test statistic which separates the acceptance region from the rejection region is known as the critical value.
 The critical value depends upon the level of significance used and on the alternative hypothesis, which determines whether a one-tailed or a two-tailed test is used.
Symbol of Critical Value
This is easy to remember:
– for a two-sided test, divide α by two and use Zα/2;
– for a one-sided test, use Zα, keeping in mind that the rejection region consists of just one piece (a small decision-rule sketch follows below).
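The following is a minimal Python sketch of this decision rule; the observed z statistic of 2.10 is a hypothetical value.

# Sketch of the rejection rule using critical values (hypothetical z statistic).
from scipy.stats import norm

alpha = 0.05
z = 2.10                                  # observed test statistic (assumed)

# Two-sided test: reject Ho if |z| exceeds Z_alpha/2.
reject_two_sided = abs(z) > norm.ppf(1 - alpha / 2)

# One-sided (right-tailed) test: reject Ho if z exceeds Z_alpha.
reject_right_tailed = z > norm.ppf(1 - alpha)

print(reject_two_sided, reject_right_tailed)   # True True for z = 2.10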


Random and Systematic Error
Random Error
 Random error is a chance difference between the observed and true
values of something (e.g., a researcher misreading a weighing scale
records an incorrect measurement).
 When you only have random error, if you measure the same thing
multiple times, your measurements will tend to cluster or vary around the
true value. Some values will be higher than the true score, while others
will be lower. When you average out these measurements, you’ll get very
close to the true score.
 For this reason, random error isn't considered a big problem when you're collecting data from a large sample: the errors in different directions will cancel each other out when you calculate descriptive statistics. But it could affect the precision of your dataset when you have a small sample (see the simulation sketch below).
 Random error can be reduced by increasing the sample size.
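Here is a minimal simulation sketch (with a hypothetical true weight and error SD) showing how purely random errors cluster around the true value and cancel out as the sample grows.

# Simulation sketch: repeated measurements with purely random error
# cluster around the true value, and their mean converges to it.
import numpy as np

rng = np.random.default_rng(0)
true_value = 70.0                       # true weight in kg (hypothetical)

for n in (5, 50, 5000):
    measurements = true_value + rng.normal(0, 2.0, size=n)   # random error, SD 2 kg
    print(f"n={n:5d}  mean={measurements.mean():.2f}")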
Random and Systematic Error
Sources of Random errors
Some common sources of random error include:
 Natural variations in real-world or experimental contexts.
 Imprecise or unreliable measurement instruments.
 Individual differences between participants or units.
 Poorly controlled experimental procedures.
Random and Systematic Error
Systematic Error
Systematic error is a consistent or proportional difference between the
observed and true values of something (e.g., a miscalibrated scale
consistently registers weights as higher than they actually are).
Systematic error is also referred to as bias because your data is skewed in
standardized ways that hide the true values. This may lead to inaccurate
conclusions.
If you have systematic error, your measurements will be biased away from
the true values. Ultimately, you might make a false positive or a false
negative conclusion (a Type I or II error) about the relationship between the
variables you’re studying.
Random and Systematic Error
Sources of systematic errors
 The sources of systematic error can range from your research materials
to your data collection procedures and to your analysis techniques.
 This isn’t an exhaustive list of systematic error sources, because they can
come from all aspects of research.
Random and Systematic Error
 Random error mainly affects precision, which is how reproducible the
same measurement is under equivalent circumstances.
 In contrast, systematic error (bias) affects the accuracy of a measurement, or how close the observed value is to the true value.

 If you always underestimated or always overestimated, that would be a bias; however, your consistently under- or overestimated measurements would still contain random error within themselves.
 Systematic errors are much more problematic than random errors because they can skew your data and lead you to false conclusions.
Random and Systematic Error
Random error introduces variability between different measurements of the same thing, while systematic error skews your measurements away from the true value in a specific direction. The simulation sketch below contrasts the two.
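The following simulation sketch (with hypothetical values) contrasts the two: random error spreads measurements around the true value, while systematic error shifts their mean away from it.

# Sketch contrasting random error (spread around the true value) with
# systematic error (a consistent bias). Numbers are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
true_value = 70.0
n = 10_000

random_only = true_value + rng.normal(0, 2.0, size=n)        # unbiased but noisy
systematic = true_value + 1.5 + rng.normal(0, 2.0, size=n)   # biased by +1.5 kg

print(f"random-only mean = {random_only.mean():.2f}  (close to {true_value})")
print(f"systematic mean  = {systematic.mean():.2f}  (shifted by the bias)")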
Basic Concept of Sample Size Calculation
Underlying event rate in the population
• It is essential to consider the prevalence rate or baseline event rate of the condition in the study population while calculating the sample size.
• It is usually estimated from previous literature, including observational cohorts.
• For example, when studying the association of alcohol and liver disease, the prevalence rate of liver disease in the study population should be known before the study.
Margin of error
• It is the random sampling error, i.e., the extent to which the sample results are likely to vary from the population value.
• For example, suppose there is a 40% prevalence of anemia in the study sample and we set the margin of error at 5%; this means that the prevalence of anemia in the population would lie between 40% ± 5%, i.e., between 35% and 45%. A short sketch relating the margin of error to the sample size is given below.
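As a related illustration, here is a minimal Python sketch of how a margin of error for a proportion follows from the normal approximation; the sample size of 400 is a hypothetical value.

# Sketch: margin of error for an observed proportion at a given confidence
# level, using the normal approximation. Values are hypothetical.
from math import sqrt
from scipy.stats import norm

p = 0.40          # observed prevalence of anemia in the sample (assumed)
n = 400           # sample size (assumed)
alpha = 0.05

moe = norm.ppf(1 - alpha / 2) * sqrt(p * (1 - p) / n)
print(f"95% margin of error = ±{moe:.3f}  "
      f"(interval {p - moe:.3f} to {p + moe:.3f})")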
Basic Concept of Sample Size Calculation
Standard deviation (SD) in the population
• A researcher must anticipate the population variance of the given outcome variable, which may be quantified by means of the SD.
• For a homogeneous population, a smaller sample size is needed, as the variance or SD is lower in such a population.
• For example, for studying the effect of an exercise regimen on blood glucose, suppose we include a population with blood glucose ranging from 150 to 350 mg/dl.
• It is then easy to see that we might require a larger number of subjects to detect differences between interventions, because the SD in this group will be larger.
• However, if we consider a sample from a population with blood glucose readings between 150 and 250 mg/dl, the researcher obtains a more similar, homogeneous group, which decreases the SD and the number of subjects required for the study.
Sample Size Calculation
Estimation of Sample Size for Cross-Sectional or Descriptive Research Studies:
Case I: Sample size when the data are on a nominal/ordinal scale and a proportion is one of the parameters

Sample size (n) = (Zα/2)² × p × q / d²

Where,
• n = Desired sample size
• Zα/2 = Critical value, the standard value for the corresponding level of confidence (at a 95% CI, or 5% level of significance (Type I error), it is 1.96, and at a 99% CI it is 2.58)
• p = Expected prevalence, or a proportion based on previous research, and q = 1 − p
• d = Margin of error or precision
Sample Size Calculation
• Example: A researcher wants to carry out a descriptive study to understand the prevalence or proportion of diabetes mellitus among adults in a city. A previous study reported that the prevalence of diabetes in the adult population was 10%. At a 95% CI and a 2% margin of error, calculate the sample size required to conduct a new study.
Solution:
Here, Prevalence (p) = 10% = 0.1 ⇒ q = 1 − p = 0.9
Confidence interval (1 − α) = 95% ⇒ α = 5% = 0.05
Critical value (Zα/2) = 1.96 [from the standard normal table]
Margin of error (d) = 2% = 0.02
Now,
Sample size (n) = (Zα/2)² × p × q / d² = (1.96² × 0.1 × 0.9) / 0.02² = 864.36 ≈ 865
(A short verification sketch follows below.)
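Below is a minimal Python sketch of the Case I formula, checked against the worked example; the function name is ours, not from the slides.

# Sketch: sample-size formula for estimating a proportion,
# n = (Z_alpha/2)^2 * p * q / d^2, checked against the worked example.
from math import ceil
from scipy.stats import norm

def sample_size_proportion(p: float, d: float, alpha: float = 0.05) -> int:
    """Sample size for estimating a proportion p with margin of error d."""
    z = norm.ppf(1 - alpha / 2)
    return ceil(z**2 * p * (1 - p) / d**2)

print(sample_size_proportion(p=0.10, d=0.02))   # 865, as in the example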
Sample Size Calculation
Case II: Sample size when the mean is the focus of the study, or the data are on an interval/ratio scale

Sample size (n) = (Zα/2)² × σ² / d²

Where,
• n = Desired number of samples
• Zα/2 = Standardized value for the corresponding level of confidence
• d = Margin of error or precision
• σ = SD, based on a previous study or a pilot study

Sample size for a finite population:

n(finite) = n / (1 + (n − 1)/N)

Where, N = Population size
n = Sample size calculated from the above formula
Sample Size Calculation
Example: Suppose a researcher wants to know the average hemoglobin level among adults in a city at a 95% CI, and the margin of error is 2 g/dl. From a previous study, the SD of hemoglobin level among adults was found to be 4.5 g/dl. How many study subjects will be required to conduct a new study?
Solution:
Standard deviation (σ) = 4.5 g/dl
Confidence interval (1 − α) = 95% ⇒ α = 5% = 0.05
Critical value (Zα/2) = 1.96 [from the standard normal table]
Margin of error (d) = 2 g/dl
Now,
Sample size (n) = (Zα/2)² × σ² / d² = (1.96² × 4.5²) / 2² = 19.45 ≈ 20
(A short verification sketch follows below.)
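Below is a minimal Python sketch of the Case II formula together with the finite-population correction stated above; the function names and the population size N = 500 are hypothetical, not from the slides.

# Sketch: sample-size formula for estimating a mean,
# n = (Z_alpha/2)^2 * sigma^2 / d^2, plus the finite-population correction
# n_finite = n / (1 + (n - 1)/N). Checked against the worked example.
from math import ceil
from scipy.stats import norm

def sample_size_mean(sigma: float, d: float, alpha: float = 0.05) -> int:
    """Sample size for estimating a mean with SD sigma and margin of error d."""
    z = norm.ppf(1 - alpha / 2)
    return ceil((z * sigma / d) ** 2)

def finite_population_correction(n: int, N: int) -> int:
    """Adjust an infinite-population sample size n for a finite population of size N."""
    return ceil(n / (1 + (n - 1) / N))

n = sample_size_mean(sigma=4.5, d=2.0)             # 20, as in the example
print(n, finite_population_correction(n, N=500))   # N = 500 is a hypothetical value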
