BIOSTATISTICS SUMMARY

TYPES OF DATA:
1. Qualitative Data: These are words or attributes indicating the category to which an element
belongs. There are two divisions of qualitative data:
• Nominal: uses names, labels, or symbols to assign each measurement to one of a
limited number of categories that cannot be ordered. E.g. sex, race, blood group.
• Ordinal: assigns each measurement to one of a limited number of categories that
are ranked in a graded order. E.g. quality of service, socio-economic status,
degree of malnutrition.
2. Quantitative Data: These are data that can be measured numerically. They are also divided
into two:
• Continuous: values can fall anywhere within a range: height, weight, blood pressure
• Discrete: values are integers: number of siblings, the number of times a person
has been admitted to a hospital, family size
SOURCES OF DATA:
1. Primary data: collected directly from the items or individual respondents for the
purpose of a particular study.
2. Secondary data: data that were already collected and statistically treated by other people
or agencies, and whose information is then used for another purpose.
VARIABLE
A variable is a characteristic that can be measured and that can assume different values. Variables
may be classified into two main categories: Categorical and Numeric.
1. Categorical Variables: either nominal or ordinal.
2. Numeric Variables: either discrete or continuous.
In an experiment, the two key variables are the independent and the dependent variable.
The independent variable is the one that the researcher intentionally changes or controls.
The dependent variable is the factor that the researcher measures. It changes in response to the
independent variable or depends upon it.
For example, a scientist wants to see if the brightness of light has any effect on a moth being
attracted to the light. The brightness of the light is controlled by the scientist. This would be the
independent variable. How the moth reacts to the different light levels (distance to light
source) would be the dependent variable.
SAMPLING METHODS
A sample is the group of individuals who will actually participate in the research. There are two
types of sampling methods:
1. Probability sampling involves random selection, allowing you to make strong statistical
inferences about the whole group. Every member of the population has a chance of being
selected. There are four main types of probability sample:
• Simple random sampling: In a simple random sample, every member of the
population has an equal chance of being selected. Your sampling frame should
include the whole population. To conduct this type of sampling, you can use tools
like random number generators or other techniques that are based entirely on
chance.
• Systematic sampling: Systematic sampling is similar to simple random sampling,
but it is usually slightly easier to conduct. Every member of the population is
listed with a number, but instead of randomly generating numbers, individuals are
chosen at regular intervals.
• Stratified sampling: To use this sampling method, you divide the population into
subgroups (called strata) based on the relevant characteristic (e.g. gender, age
range, income bracket, job role). Based on the overall proportions of the
population, you calculate how many people should be sampled from each
subgroup. Then you use random or systematic sampling to select a sample from
each subgroup.
• Cluster sampling: Cluster sampling also involves dividing the population into
subgroups, but each subgroup should have similar characteristics to the whole
population. Instead of sampling individuals from each subgroup, you randomly select
entire subgroups.
2. Non-probability sampling involves non-random selection based on convenience or
other criteria, allowing you to easily collect data. In a non-probability sample, individuals
are selected based on non-random criteria, and not every individual has a chance of being
included.
• Convenience sampling: A convenience sample simply includes the individuals
who happen to be most accessible to the researcher. This is an easy and
inexpensive way to gather initial data, but there is no way to tell if the sample is
representative of the population, so it can’t produce generalizable results.
• Voluntary response sampling: Similar to a convenience sample, a voluntary
response sample is mainly based on ease of access. Instead of the researcher
choosing participants and directly contacting them, people volunteer themselves
(e.g. by responding to a public online survey).
• Purposive sampling: This type of sampling, also known as judgement sampling,
involves the researcher using their expertise to select a sample that is most useful
to the purposes of the research.
• Snowball sampling: If the population is hard to access, snowball sampling can be
used to recruit participants via other participants. The number of people you have
access to “snowballs” as you get in contact with more people.
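The four probability sampling methods described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 people, the "F"/"M" labels, the ten mixed clusters, and all function names are hypothetical and chosen for the example.

```python
import random

# Hypothetical population: 100 people, 60 labelled "F" and 40 labelled "M".
population = [(i, "F" if i < 60 else "M") for i in range(100)]
rng = random.Random(42)  # fixed seed so the sketch is repeatable


def simple_random_sample(units, n):
    """Every unit has an equal chance of selection."""
    return rng.sample(units, n)


def systematic_sample(units, n):
    """Pick a random start, then take every k-th unit from the list."""
    k = len(units) // n        # sampling interval
    start = rng.randrange(k)   # random start within the first interval
    return [units[start + i * k] for i in range(n)]


def stratified_sample(units, stratum_of, n):
    """Sample each stratum in proportion to its share of the population."""
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from n in general.
        quota = round(n * len(members) / len(units))
        sample.extend(rng.sample(members, quota))
    return sample


def cluster_sample(groups, n_groups):
    """Randomly select whole clusters and keep every unit inside them."""
    chosen = rng.sample(sorted(groups), n_groups)
    return [unit for name in chosen for unit in groups[name]]


# Hypothetical clusters: ten mixed groups, each with 6 "F" and 4 "M" members,
# so every cluster resembles the population as a whole.
clusters = {
    f"group_{g}": [u for u in population if u[0] % 10 == g] for g in range(10)
}

srs = simple_random_sample(population, 10)
systematic = systematic_sample(population, 10)
stratified = stratified_sample(population, lambda unit: unit[1], 10)
clustered = cluster_sample(clusters, 2)
```

With these assumed numbers the stratified sample contains 6 "F" and 4 "M" members, mirroring the 60/40 split in the population, while the cluster sample contains all 20 members of the two selected groups.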
HYPOTHESIS TESTING:
The general goal of a hypothesis test is to rule out chance (sampling error) as a plausible
explanation for the results from a research study.
Hypothesis testing is a technique used to help determine whether a specific treatment has an
effect on the individuals in a population.
The null and alternative hypotheses offer competing answers to your research question. When
the research question asks “Does the independent variable affect the dependent variable?”:
• The null hypothesis (H0) answers “No, there’s no effect in the population.”
• The alternative hypothesis (Ha) answers “Yes, there is an effect in the population.”
If the sample provides enough evidence against the claim that there’s no effect in the population,
that is, if the p-value is at most the chosen significance level α (p ≤ α), then we can reject the
null hypothesis. Otherwise, we fail to reject the null hypothesis.
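The decision rule p ≤ α can be illustrated with one possible test, a two-sided one-sample z-test, which assumes the population standard deviation is known. The summary does not prescribe this test; it is picked here only as a simple example, and every number below (a hypothesised mean of 120, σ = 15, n = 36, a sample mean of 126) is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0, with sigma known."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Probability of a result at least this extreme if H0 were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject_h0 = p_value <= alpha
    return z, p_value, reject_h0


# Hypothetical study: 36 patients with a mean of 126, tested against mu0 = 120.
z, p_value, reject_h0 = one_sample_z_test(126, 120, 15, 36)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

With these assumed inputs the test statistic is z = 2.4 and the p-value falls below α = 0.05, so the null hypothesis is rejected.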
Measures of Central Tendency:
3 measures of central tendency are commonly used in statistical analysis - MEAN, MEDIAN,
and MODE.
• Mean: the arithmetic average, $\bar{X} = \frac{\sum x}{N}$, where $\sum x$ is the sum of all
observations and N is the number of observations.
• Median: Used to find middle value (center) of a distribution.
• Mode: Used when the most typical (common) value is desired. When there are two
modes, we say the distribution is bimodal.
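The three measures can be computed with Python's standard `statistics` module. The data set below is hypothetical and was chosen so that it has two modes.

```python
from statistics import mean, median, multimode

# Hypothetical observations with two equally common values (3 and 7).
values = [2, 3, 3, 5, 7, 7, 8]

average = mean(values)      # sum of the values divided by their count
middle = median(values)     # middle value of the sorted data
modes = multimode(values)   # every value tied for most frequent

print(average, middle, modes)
is_bimodal = len(modes) == 2
```

Here the mean and median are both 5, and `multimode` returns two values, so this distribution is bimodal.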
Measures of Variability

Variance ($\sigma^2$): $s^2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{28}{6} = 4.67$

Standard Deviation ($\sigma$): $s = \sqrt{s^2} = \sqrt{4.67} = 2.16$

Mean absolute deviation = $\frac{\sum |X_i - \bar{X}|}{N}$, where $\bar{X}$ is the mean, $X_i$ is each
individual value, and N is the number of observations.
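The worked figures above (a sum of squares of 28 over N = 6, giving 4.67 and 2.16) can be reproduced in Python. The summary does not list the underlying data, so the six values below are a hypothetical data set chosen only because it yields the same sum of squares.

```python
from math import sqrt

# Hypothetical data: chosen so that the squared deviations sum to 28 with N = 6.
data = [2, 3, 4, 6, 7, 8]
n = len(data)

mean_value = sum(data) / n
sum_of_squares = sum((x - mean_value) ** 2 for x in data)

variance = sum_of_squares / n           # divides by N, as in the formula above
standard_deviation = sqrt(variance)
mean_abs_deviation = sum(abs(x - mean_value) for x in data) / n

print(round(variance, 2), round(standard_deviation, 2), mean_abs_deviation)
```

Note that the mean absolute deviation uses absolute values; without them the positive and negative deviations from the mean would cancel to zero.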


NORMAL DISTRIBUTION
Normal distribution, also known as the Gaussian distribution, is a probability distribution that is
symmetric about the mean, showing that data near the mean are more frequent in occurrence than
data far from the mean.
Properties of a Normal Distribution
A normal distribution has the following properties:
• The mean, median, and mode are equal.
• The normal curve is bell shaped and is symmetric about the mean.
• The total area under the normal curve is equal to 1.
• The normal curve approaches, but never touches, the x-axis as it extends farther and
farther away from the mean.
• The points at which the curve changes from curving upward to curving downward are
called inflection points.
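Several of these properties can be checked numerically with `statistics.NormalDist`. The mean of 100 and standard deviation of 15 are hypothetical values used only for the check.

```python
from statistics import NormalDist

# Hypothetical normal distribution used only to check the listed properties.
dist = NormalDist(mu=100, sigma=15)

# Symmetry: the curve has the same height 20 units either side of the mean.
is_symmetric = abs(dist.pdf(80) - dist.pdf(120)) < 1e-12

# The curve is highest at the mean, which is also the median.
peak_at_mean = dist.pdf(100) > dist.pdf(99) and dist.pdf(100) > dist.pdf(101)
median_equals_mean = abs(dist.inv_cdf(0.5) - dist.mean) < 1e-9

# Total area: integrating from 10 SDs below to 10 SDs above the mean gives ~1.
total_area = dist.cdf(250) - dist.cdf(-50)

# Tails approach the x-axis but stay above it.
tail_height = dist.pdf(250)

print(is_symmetric, peak_at_mean, median_equals_mean, total_area, tail_height)
```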
THE STANDARD NORMAL DISTRIBUTION
The standard normal distribution, also called the z-distribution, is a special normal
distribution where the mean is 0 and the standard deviation is 1. Any normal distribution can be
standardized by converting its values into z scores. Z scores tell you how many standard
deviations from the mean each value lies.

How to calculate a z score

To standardize a value from a normal distribution, convert the individual value into a z-score:

1. Subtract the mean from your individual value.
2. Divide the difference by the standard deviation.

Z-score formula: $z = \frac{x - \mu}{\sigma}$

• x = individual value
• μ = mean
• σ = standard deviation
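The two steps translate directly into a small Python function. The function name and the example values (a mean of 100 and a standard deviation of 15) are hypothetical.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("standard deviation must be positive")
    difference = x - mu         # step 1: subtract the mean
    return difference / sigma   # step 2: divide by the standard deviation


# Hypothetical example: a value of 130 when the mean is 100 and the SD is 15.
z_above = z_score(130, 100, 15)   # two standard deviations above the mean
z_below = z_score(85, 100, 15)    # one standard deviation below the mean
print(z_above, z_below)
```

A positive z score means the value lies above the mean, a negative one means it lies below, and a z score of 0 means the value equals the mean.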

Correlation

Correlation analysis is the statistical tool used to study the closeness of the relationship
between two or more variables. The variables are said to be correlated when the movement of
one variable is accompanied by the movement of another variable. The most widely used
measure is Karl Pearson’s coefficient of correlation, popularly known as the Pearsonian
coefficient of correlation and represented by the symbol r.
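The summary does not give the formula for r. A minimal sketch using the standard definition (the sum of cross-products of deviations divided by the square root of the product of the two sums of squares) is shown below. The function name and the paired data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

The value of r always lies between -1 and +1: values near +1 or -1 indicate a close linear relationship, and values near 0 indicate a weak one.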
Difference Between Correlation Coefficient and Regression Analysis
The correlation coefficient measures the “degree of relationship” between variables, say X and
Y, whereas regression analysis studies the “nature of relationship” between the variables.
The correlation coefficient does not indicate a cause-and-effect relationship between the
variables, i.e. it cannot be said with certainty that one variable is the cause and the other is the
effect. Regression analysis, by contrast, treats one variable as independent and the other as
dependent and estimates how the dependent variable changes with the independent one. On its
own, however, regression does not prove causation either.
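To make the contrast concrete, the sketch below fits a straight line Y = a + bX by ordinary least squares, one common form of regression analysis. The summary does not specify a method, so this choice, the function name, and the data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line y = a + b * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: X is treated as independent, Y as dependent.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]
intercept, slope = fit_line(x_values, y_values)

# The fitted line can be used to predict Y for a new X.
predicted_y = intercept + slope * 6
print(round(intercept, 2), round(slope, 2), round(predicted_y, 2))
```

The slope states how much Y is expected to change for a one-unit change in X, which a single correlation coefficient cannot express.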
