Variables
Categorical variables represent data that can be divided into distinct groups
or categories. They can be further classified into nominal and ordinal
variables.
Interval variables are numerical variables that have equal intervals between
values, but no true zero point. Examples include temperature in Celsius or
Fahrenheit, where the difference between 20°C and 30°C is the same as
between 30°C and 40°C, but 0°C does not indicate an absence of
temperature.
Ratio variables are numerical variables that have equal intervals between
values and a true zero point, which allows for meaningful ratios. Examples
include height, weight, age, and income. For instance, a weight of 0 means
no weight, and a weight of 10 kg is twice as much as 5 kg.
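The difference between interval and ratio variables can be checked numerically. The sketch below is illustrative only: the helper `celsius_to_fahrenheit` and the constant `KG_TO_LB` are names introduced here, not part of the notes. It shows that a ratio of temperatures changes with the unit, while a ratio of weights does not.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


KG_TO_LB = 2.20462  # approximate pounds per kilogram (illustrative constant)

# Interval variable: the "ratio" of two temperatures depends on the unit,
# because neither scale has a true zero.
ratio_celsius = 20.0 / 10.0
ratio_fahrenheit = celsius_to_fahrenheit(20.0) / celsius_to_fahrenheit(10.0)

# Ratio variable: the ratio of two weights is the same in any unit,
# because 0 kg and 0 lb both mean "no weight".
ratio_kg = 10.0 / 5.0
ratio_lb = (10.0 * KG_TO_LB) / (5.0 * KG_TO_LB)

print(f"20 C / 10 C = {ratio_celsius:.2f}, same temperatures in F = {ratio_fahrenheit:.2f}")
print(f"10 kg / 5 kg = {ratio_kg:.2f}, same weights in lb = {ratio_lb:.2f}")
```

The Celsius ratio is 2 but the Fahrenheit ratio is 1.36, so "twice as hot" has no fixed meaning. The weight ratio stays at 2 in both units.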
5. *Confounding Variables*:
6. *Moderator Variables*:
7. *Time Variables*:
NORMAL DISTRIBUTION
Normal distribution, also known as the Gaussian distribution, is a probability
distribution that appears as a “bell curve” when graphed. The normal
distribution describes a symmetrical plot of data around its mean value,
where the width of the curve is defined by the standard deviation.
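Formula:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
x = value of the variable or data being examined
f(x) = the probability density function
μ = the mean
σ = the standard deviation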
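As a minimal sketch, the normal density can be evaluated directly from its standard form, with mean μ and standard deviation σ as parameters. The function name `normal_pdf` and its defaults are illustrative choices.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in standard deviations
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# The curve peaks at the mean and is symmetric around it.
print(normal_pdf(0.0))                     # height at the mean of a standard normal
print(normal_pdf(-1.5) == normal_pdf(1.5)) # symmetry around mu = 0
```

A larger `sigma` lowers the peak and widens the curve, which matches the description of the standard deviation as the width of the bell.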
Empirical rule
Around 68.3% of values are within 1 standard deviation of the mean.
Around 95.4% of values are within 2 standard deviations of the
mean.
Around 99.7% of values are within 3 standard deviations of the
mean.
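These percentages can be checked without simulation. For a normal variable, the probability of falling within k standard deviations of the mean equals erf(k/√2). The sketch below uses the standard library's `math.erf`; the function name `prob_within_k_sd` is illustrative.

```python
import math


def prob_within_k_sd(k):
    """P(|X - mu| < k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within_k_sd(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```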
The Central Limit Theorem (CLT) states that the sum (or average) of a large
number of independent, identically distributed variables with finite variance
will be approximately normally distributed, regardless of the original
distribution of the variables. This is crucial for inferential statistics, as it
justifies normal distribution assumptions in hypothesis testing and
confidence interval estimation.
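A small simulation can make the CLT concrete. The sketch below is illustrative: the exponential distribution, the sample size of 50, the 2000 repetitions and the seed are arbitrary choices. It draws strongly skewed data and checks that the sample means cluster around the true mean in a roughly normal way.

```python
import random
import statistics


def sample_means(sample_size, n_samples, seed=0):
    """Means of n_samples independent samples from an exponential(1) distribution."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


means = sample_means(sample_size=50, n_samples=2000)
center = statistics.fmean(means)   # should be close to the true mean, 1.0
spread = statistics.stdev(means)   # should be close to 1 / sqrt(50)
within_one_sd = sum(abs(m - center) < spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f}")
print(f"SD of sample means:   {spread:.3f}")
print(f"share within 1 SD:    {within_one_sd:.3f}")
```

Although the exponential distribution is far from symmetric, the share of sample means within one standard deviation should land near the roughly 68% expected for a normal distribution.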
Applications in Biostatistics:
Examples
SKEWNESS
Positive Skewness (Right Skewness): When the tail on the right side of
the distribution is longer or fatter than the left side. The bulk of the values lie
to the left of the mean, and the mean is typically greater than the median.
Negative Skewness (Left Skewness): When the tail on the left side of the
distribution is longer or fatter than the right side. The bulk of the values lie to
the right of the mean, and the mean is typically less than the median.
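The two cases can be told apart numerically. The sketch below assumes one common definition, the moment coefficient of skewness (third central moment divided by the 1.5 power of the second), and uses made-up data. The function name `sample_skewness` is illustrative.

```python
import statistics


def sample_skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]     # one large value stretches the right tail
left_skewed = [-x for x in right_skewed]     # mirror image: long left tail

for name, data in (("right", right_skewed), ("left", left_skewed)):
    print(
        f"{name}-skewed: skewness = {sample_skewness(data):+.2f}, "
        f"mean = {statistics.fmean(data):.2f}, median = {statistics.median(data):.2f}"
    )
```

For the right-skewed data the coefficient is positive and the mean (3.5) exceeds the median (3). Mirroring the data flips both relationships.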
**Limitations of Skewness:**
- **Positive Skewness**: The peak (mode) is to the left of the center, with
the right tail being longer.
3. Relation to mode:
4. Concentration of data:
5. Outliers:
7. Common examples:
8. Shape analogy:
9. Mathematical representation:
KURTOSIS
Kurtosis is a statistical measure of the “tailedness” of a distribution: how
heavy its tails are relative to its overall shape, and therefore its
propensity for producing outliers. It indicates how much of the data lies in
the tails compared with the center of the distribution.
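As a rough illustration, kurtosis can be computed from central moments. The sketch below assumes the common "excess kurtosis" convention (fourth central moment divided by the squared second moment, minus 3, so that a normal distribution scores 0). The data sets and the function name `excess_kurtosis` are made up for the example.

```python
def excess_kurtosis(data):
    """Excess kurtosis: m4 / m2**2 - 3 (population moments)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0


flat = [1, 2, 3, 4, 5]                            # evenly spread, no extreme values
heavy_tailed = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]  # mostly central, two outliers

print(f"flat data:         {excess_kurtosis(flat):+.2f}")
print(f"heavy-tailed data: {excess_kurtosis(heavy_tailed):+.2f}")
```

The evenly spread data gives a negative value (lighter tails than a normal distribution), while the data with two outliers gives a positive value. This also shows how strongly a few extreme observations drive the measure.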
1. **Types of Kurtosis**:
**Importance of Kurtosis:**
2. Risk Assessment
3. Distribution Shape Insight
Limitations of Kurtosis:
2. Complex Interpretation
3. Outlier Sensitivity