Basic Biostatistics

The document discusses basic biostatistics and its role in epidemiology, emphasizing the importance of summarizing and analyzing data through various methods such as tables, graphs, and statistical tests. It covers different types of data, measures of central tendency and variability, and introduces key statistical methods like t-tests, chi-squared tests, correlation, and regression. Additionally, it highlights the significance of confidence intervals, hypothesis testing, and meta-analysis in drawing conclusions from data.

Uploaded by

boluwatifedareowolabi


BASIC BIOSTATISTICS: CONCEPTS AND TOOLS

Basic Epidemiology, chapter 4
Summarizing data
• In epidemiology, biostatistics helps to:
• Summarize and analyze data
• Test hypotheses
• Data can be either:
• Numeric, e.g. age or measurements of height and weight
• Categorical, i.e. obtained by classification, e.g. classifying individuals by blood group: A, B, AB, or O
• Ordinal, i.e. expressed as ranks, e.g. socioeconomic status: low, middle, high
• Large amounts of data need to be summarized before appropriate conclusions can be drawn
• Data can be summarized with tables and graphs
Summarizing data: tables and graphs
• Tables and graphs are made to display data in a way that can be easily understood
• Every table and graph must have enough information to be interpreted without reference to the text
• Titles should clearly describe the contents of the tables or graphs being presented
Table 1: Advantages of tables and graphs
Advantages of tables | Advantages of graphs
1. Help to display more complex data with precision and flexibility | Simple and easy to understand
2. Require less technical skill to prepare | Make use of vivid images that aid memory
3. Take less space for a given amount of information | Can show complex relationships between variables

• There are different types of graphs:

• Pie charts and component band charts
• Spot maps and rate maps
• Bar charts and line graphs
Pie charts and component band charts
• Pie charts and component band charts show how an entity is divided into its constituent parts.
• A pie chart represents information in a circle, while a component band chart uses bands
• When there are two or more whole entities to be divided into components, it is better to use a component band chart

Fig. 1. Pie chart of number of deaths by cause among 25–34 and 35–44 year olds, United States, 2003
Fig. 2. Component band chart showing number of deaths by cause among 25–44 year olds, United States, 1997 and ...
Spot and rate maps
• They are maps that show the geographical distribution of disease or other epidemiological events
• Rate maps show how values differ by geographical location, for example:
• Prevalence
• Incidence
• Mortality
• Spot and rate maps can show data in both static and interactive form, e.g. the World Health Chart
Table 2: Functions of spot and rate maps
Spot maps | Rate maps
Show the locations of individual cases | Show the distribution of rates (e.g. disease rates) across different areas
Each point represents a single case | Use colors or shading to show differences in rates across regions
Spot and rate maps

Fig. 3. Spot map of bacterial meningitis cases, Upper West Region, 2018–2020.
Fig. 4. Rate maps of the world showing incidence rates and mortality rates of thyroid cancer among women
Bar charts and line graphs

• Bar charts compare two or more categories of data
• Bar charts convey data with bars of varying length
• Line graphs display differences in the value of continuous variables
Frequency distributions and histograms

• A frequency distribution shows how often each value in a dataset occurs
• Frequency distributions are often presented in tabular form
• A histogram is a visual representation of a frequency distribution
• A frequency polygon is a line that connects the midpoints of the tops of the histogram bars (e.g. the bell-shaped curve of a normal distribution)
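As a minimal sketch of these ideas, the Python below builds a frequency distribution and the class-interval counts behind a histogram. The ages, the interval width of 10 years and the helper name `histogram_counts` are assumptions made for illustration, not data from the slides.

```python
from collections import Counter

# Hypothetical sample: ages (in years) of 12 study participants.
ages = [23, 25, 25, 31, 31, 31, 38, 40, 40, 47, 52, 58]

# Frequency distribution: how often each individual value occurs.
frequency = Counter(ages)

def histogram_counts(values, start, width, n_bins):
    """Count the values falling in each class interval [lower, lower + width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Four class intervals: 20-29, 30-39, 40-49, 50-59.
bins = histogram_counts(ages, start=20, width=10, n_bins=4)

# A text histogram: one '#' per observation in each interval.
for i, count in enumerate(bins):
    lower = 20 + 10 * i
    print(f"{lower}-{lower + 9}: {'#' * count}")
```

Joining the tops of these bars at their midpoints would give the frequency polygon.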
Normal distribution

• The normal distribution is a bell-shaped probability distribution
• The mean, median, and mode are all equal
• Most of the data cluster around the mean
• ~68% of the data fall within 1 SD of the mean
• ~95% of the data fall within 2 SD of the mean
• ~99.7% of the data fall within 3 SD of the mean
• Normal distributions are important because:
• They model much real-world data
• They allow the use of key statistical methods
• Statistical methods like linear regression, t-tests and ANOVA assume that the data follow a normal distribution
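The 68%, 95% and 99.7% figures can be checked numerically. This short sketch uses the error function from Python's standard library; the function name `share_within` is an illustrative choice.

```python
import math

def share_within(k):
    """Probability that a normally distributed value lies within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```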
Summary numbers: measures of central tendency
• Measures of central tendency
• Mean
• The sample average
• For a sample with n values of a variable x, the sample mean is:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
• Median
• The middle value after all the measurements have been put in order
• Mode
• The value that occurs most frequently in a sample
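A small sketch of the three measures, using Python's standard `statistics` module. The blood pressure values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical sample: systolic blood pressure (mmHg) of 7 people.
values = [118, 120, 120, 125, 130, 135, 142]

mean = statistics.mean(values)      # sum of the values divided by n
median = statistics.median(values)  # middle value of the ordered sample
mode = statistics.mode(values)      # most frequently occurring value

print(f"mean={mean:.1f}, median={median}, mode={mode}")
```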
Measures of variability

• Measures of variability describe how much individual data points in a sample differ from one another
• Useful for:
• Generalizing about a population
• Identifying outliers
• Comparing different data sets
• The most useful measures of variability are:
• Variance
• Standard deviation
• Standard error of the mean
Measures of variability

• Standard deviation
• Measures how far apart values are from the mean. A low SD indicates that the values are close to the mean
• Square root of the variance (for a sample, the standard deviation is usually written s):
$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
• Variance
• Measures the average degree to which each value differs from the mean
• Square of the standard deviation:
$\text{variance} = \sigma^2$
• Standard error
• Measure of potential error
• Estimates the efficiency, accuracy and consistency of a sample
• The higher the SE, the lower the reliability
$SE = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
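To make the three formulas concrete, this sketch computes each by hand and cross-checks the first two against Python's `statistics` module, which also uses the n - 1 denominator. The data are hypothetical.

```python
import math
import statistics

# Hypothetical sample: systolic blood pressure (mmHg) of 7 people.
values = [118, 120, 120, 125, 130, 135, 142]
n = len(values)
mean = sum(values) / n

# Variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in values) / (n - 1)

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

# Standard error of the mean: SD divided by the square root of n.
se = sd / math.sqrt(n)

print(f"variance={variance:.2f}, SD={sd:.2f}, SE={se:.2f}")
print("matches statistics module:",
      math.isclose(variance, statistics.variance(values)),
      math.isclose(sd, statistics.stdev(values)))
```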
Basic concepts of statistical inference

• Random sample data are used to draw conclusions about a population
• These conclusions are made in terms of summary numbers
• Summary numbers for a population (parameters) are represented by Greek letters, e.g. μ, σ and β
• Estimates of these parameters obtained from a sample are represented by x̄, s and b
Using samples to understand populations
• To make statistical inferences, selecting a random sample is important
• Every member of a population has an equal chance of being included in a random sample
• The mean of a random sample is an unbiased estimate of the population mean: averaged over repeated samples, it equals the population mean
• Sample sizes must be large enough for a study to have statistical power
• Sample sizes are calculated based on the following:
• Prevalence
• Acceptable error
• Detectable difference
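The slides do not give a sample-size formula. As one illustration of how prevalence and acceptable error enter the calculation, the sketch below assumes the commonly used formula for estimating a proportion, n = z² p(1 - p) / d². The function name and input values are hypothetical.

```python
import math

def sample_size_for_prevalence(expected_prevalence, acceptable_error, z=1.96):
    """Smallest n that estimates a proportion to within +/- acceptable_error.

    Assumes simple random sampling and the normal approximation;
    z = 1.96 corresponds to 95% confidence.
    """
    p = expected_prevalence
    n = (z ** 2) * p * (1 - p) / (acceptable_error ** 2)
    return math.ceil(n)

# Hypothetical inputs: expected prevalence 50%, acceptable error of 5 percentage points.
n_required = sample_size_for_prevalence(0.5, 0.05)
print(n_required)
```

Halving the acceptable error roughly quadruples the required sample size, because the error term is squared in the denominator.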
Confidence intervals

• A range of values, between a lower and an upper bound, that is expected to contain the true population value with a stated level of confidence
• Represents how much uncertainty there is in a statistic
• Shows how much a statistic would vary if a study were repeated several times
• Can be used to test a hypothesis
• The most commonly used confidence level is 95%
• If the study were repeated many times, about 95% of the resulting intervals would contain the true value
Calculating confidence intervals
• To calculate the upper and lower bounds of a confidence interval, the following are needed:
• Sample size, n
• Sample mean, x̄
• Sample standard deviation, s
• A constant (1.96 for a 95% confidence interval based on the normal distribution)
• Example: n = 10, x̄ = 67.9, s = 10.2
Lower bound: $LB = \bar{x} - 1.96 \, s / \sqrt{n}$
Upper bound: $UB = \bar{x} + 1.96 \, s / \sqrt{n}$
The resulting confidence interval is: $C(LB < \mu < UB)$
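The arithmetic for the example above (n = 10, mean 67.9, SD 10.2, constant 1.96) can be sketched as follows. The variable names are illustrative.

```python
import math

n = 10              # sample size
sample_mean = 67.9  # sample mean
sample_sd = 10.2    # sample standard deviation
z = 1.96            # constant for a 95% interval (normal distribution)

standard_error = sample_sd / math.sqrt(n)
margin = z * standard_error

lower_bound = sample_mean - margin
upper_bound = sample_mean + margin

print(f"95% CI: {lower_bound:.2f} to {upper_bound:.2f}")  # about 61.58 to 74.22
```

With a sample as small as n = 10, a multiplier from the t-distribution (about 2.26 for 9 degrees of freedom) is often used instead of 1.96, which gives a somewhat wider interval.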
Hypothesis tests, p-values, statistical power

• In biostatistics, hypothesis testing involves putting assumptions about a population parameter to the test
• In testing a hypothesis, it is important to:
• Make a careful statement concerning the hypothesis to be tested
• Know the p-value associated with the test
• Know the statistical power of the test
Fig. Possible results of a hypothesis test
P-value
• A measure of how strong the evidence against a null hypothesis is
• It is the probability of obtaining data at least as extreme as those observed if the null hypothesis were true
• The p-value helps to determine whether the data support or contradict the null hypothesis
• If the p-value is smaller than a pre-defined threshold (commonly 0.05), it suggests that:
• The null hypothesis should be rejected
• The observed result is unlikely to have been caused by chance alone
• There is evidence to support an alternative hypothesis
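As a rough illustration of comparing a p-value with a threshold, the sketch below computes a two-sided p-value from a z statistic. It assumes the test statistic follows a standard normal distribution under the null hypothesis. The threshold of 0.05 and the z values are illustrative choices.

```python
import math

def two_sided_p_value(z):
    """Two-sided p-value for a z statistic under a standard normal null distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

alpha = 0.05  # pre-defined threshold (assumed here)

for z in (0.5, 1.96, 3.0):
    p = two_sided_p_value(z)
    decision = "reject" if p < alpha else "do not reject"
    print(f"z={z}: p={p:.4f} -> {decision} the null hypothesis")
```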
Statistical power
• The ability of a statistical test to detect an effect when it truly exists
• It is the probability of correctly rejecting the null hypothesis when the alternative is true
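Power depends on the size of the effect, the variability and the sample size. The sketch below shows this for a simple case: a two-sided one-sample z-test with a known standard deviation. That setting is an assumption made for illustration, as are the function name and the numbers.

```python
import math
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a two-sided one-sample z-test to detect a true mean shift of `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha = 0.05
    shift = effect * math.sqrt(n) / sd          # true shift measured in standard errors
    # Probability that the test statistic falls in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# Hypothetical setting: true difference of 5 units, SD of 10 units.
for n in (10, 30, 50):
    print(f"n={n}: power={z_test_power(5, 10, n):.2f}")
```

When the true effect is zero, the function returns alpha, the probability of rejecting a null hypothesis that is in fact true.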
Basic biostatistics methods in epidemiology

• t-tests
• Chi-squared tests
• Correlation
• Regression
T-tests
• Tests whether two means differ significantly; under the null hypothesis, the means are equal
• Helps to understand whether the observed difference could be due to chance
• There are different types of t-tests:
• Independent samples t-test: compares the means of two separate and unrelated groups (e.g. comparing mean ages between two different populations)
• Paired samples t-test: used for two sets of measurements that are paired. Takes the dependency between the measurements into account (e.g. comparing the mean blood pressure in the same population before and after medication)
• One-sample t-test: compares the mean of a group to a known or hypothesized value (e.g. comparing the level of pesticides in a population with the government-approved limit)
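The three t statistics can be sketched with the standard library alone. The function names and data are hypothetical, and the independent-samples version assumes equal variances in the two groups (the pooled form). Turning a t statistic into a p-value needs the t-distribution, which is not computed here.

```python
import math
import statistics

def one_sample_t(values, hypothesized_mean):
    """t statistic comparing a sample mean with a hypothesized value."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - hypothesized_mean) / standard_error

def paired_t(before, after):
    """t statistic for paired measurements: a one-sample test on the differences."""
    differences = [a - b for a, b in zip(after, before)]
    return one_sample_t(differences, 0)

def independent_t(group_a, group_b):
    """t statistic for two unrelated groups, assuming equal variances (pooled)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

# Hypothetical blood pressure readings before and after medication.
before = [150, 142, 138, 160, 155]
after = [144, 140, 135, 150, 149]
print(f"paired t = {paired_t(before, after):.2f}")
```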
Chi-squared tests for cross-tabulations
• The chi-squared test is a statistical analysis used to determine whether there is a significant association between two categorical variables
• To perform this test, a cross-tabulation or contingency table is created
• χ² = Σ [(Observed frequency − Expected frequency)² / Expected frequency]
• After calculating the chi-squared statistic, it is compared to the critical value from the chi-squared distribution
• If the calculated chi-squared statistic is greater than the critical value, the null hypothesis is rejected
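A minimal sketch of the calculation for a 2 x 2 contingency table follows. The counts are hypothetical. Expected frequencies come from the row and column totals, and 3.841 is the critical value for 1 degree of freedom at the 5% level.

```python
def chi_squared_statistic(table):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical cross-tabulation: rows = exposed / unexposed, columns = diseased / healthy.
table = [[10, 20],
         [20, 10]]

chi2 = chi_squared_statistic(table)
CRITICAL_VALUE_DF1 = 3.841  # 5% significance level, 1 degree of freedom
reject_null = chi2 > CRITICAL_VALUE_DF1

print(f"chi-squared = {chi2:.3f}, reject null hypothesis: {reject_null}")
```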
Correlation
• Quantifies the degree to which two variables vary together
• Results relating to correlation can be:
• Positive
• Negative
• No correlation
• Correlation is typically measured with a correlation coefficient
• The most common is the Pearson correlation coefficient, r
• r ranges from -1 to +1
• If r is close to +1, positive correlation
• If r is close to -1, negative correlation
• If r is close to 0, zero or weak correlation
• To visualize correlation, scatter plots are the best
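The Pearson coefficient can be computed directly from its definition, as in the sketch below. The function name and the height and weight values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_deviation / math.sqrt(ss_x * ss_y)

# Hypothetical data: height (cm) and weight (kg) of five people.
height = [150, 160, 165, 172, 180]
weight = [52, 58, 63, 70, 79]
print(f"r = {pearson_r(height, weight):.3f}")
```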
Regression
• Statistical method used to examine the relationship between a dependent variable and one or more independent variables
• Helps to understand how a change in the independent variable is related to a change in the dependent variable
• Regression models help to estimate values of a dependent variable based on the independent variable
• Common regression models:
• Linear regression
• Logistic regression
• Cox proportional hazards regression
Linear regression
• Linear regression assumes a straight-line relationship between the dependent and independent variables.
• Mathematically, this model is expressed as:
Y = b0 + b1*X + ε
where:
Y = the dependent variable
X = the independent variable
b0 = the y-intercept, which represents the predicted value of Y when X is zero
b1 = the slope of the line, indicating how much Y is expected to change for a one-unit increase in X
ε = the error term, i.e. the part of Y not explained by the straight line
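The slides do not say how b0 and b1 are estimated. The sketch below assumes ordinary least squares, the usual method for a single independent variable. The function name and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the model Y = b0 + b1*X + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-deviation of X and Y divided by the squared deviation of X.
    b1 = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
          / sum((a - mean_x) ** 2 for a in x))
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical data: age (years) and systolic blood pressure (mmHg).
age = [30, 40, 50, 60, 70]
pressure = [118, 124, 131, 135, 142]

b0, b1 = fit_line(age, pressure)
print(f"intercept b0 = {b0:.1f}, slope b1 = {b1:.2f}")
print(f"predicted pressure at age 55: {b0 + b1 * 55:.1f}")
```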
Logistic regression
• Analyses the relationship between a categorical dependent variable and one or more independent variables
• Used for situations where the dependent variable is binary (takes one of two values)
• In logistic regression, the relationship between the independent variables and the probability of the outcome is modeled using the logistic (sigmoid) function
• Logistic regression works with odds: the ratio of the probability of success to the probability of failure
• Exponentiating a regression coefficient gives the odds ratio associated with a one-unit increase in that independent variable
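To show how the sigmoid function, odds and odds ratio fit together, the sketch below uses made-up coefficients rather than a fitted model. It checks that a one-unit increase in X multiplies the odds by exp(b1).

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))

def odds(p):
    """Odds: probability of success divided by probability of failure."""
    return p / (1 - p)

# Hypothetical coefficients for the model: log-odds = b0 + b1 * X.
b0, b1 = -1.0, 0.5

p_at_2 = sigmoid(b0 + b1 * 2)  # predicted probability when X = 2
p_at_3 = sigmoid(b0 + b1 * 3)  # predicted probability when X = 3

odds_ratio = odds(p_at_3) / odds(p_at_2)
print(f"odds ratio per unit of X = {odds_ratio:.4f}, exp(b1) = {math.exp(b1):.4f}")
```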
Survival analyses and Cox proportional hazards models
• Used to investigate the relationship between the survival time of patients and predictor variables (covariates)
• It is a multivariate statistical model
• h(t) = h0(t)*exp(b1x1 + b2x2 + ... + bpxp)
• In this model,
• t represents survival time
• h(t) is the hazard function, which is determined by the covariates (x1, x2, ..., xp)
• The coefficients b1, b2, ..., bp measure the impact of the covariates
• h0(t) is the baseline hazard
• Censoring affects the computation of Cox proportional hazards models
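The formula implies that the hazard relative to baseline, h(t) / h0(t), depends only on the covariates and their coefficients. The sketch below evaluates that ratio with hypothetical coefficients. It does not fit a Cox model, which requires partial-likelihood estimation.

```python
import math

def relative_hazard(coefficients, covariates):
    """h(t) / h0(t) = exp(b1*x1 + b2*x2 + ... + bp*xp)."""
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return math.exp(linear_predictor)

# Hypothetical coefficients: b1 for smoking (0 = no, 1 = yes), b2 for age in years.
coefficients = [0.7, 0.03]

smoker = relative_hazard(coefficients, [1, 60])
non_smoker = relative_hazard(coefficients, [0, 60])

# Hazard ratio for smoking at the same age: the baseline hazard cancels out.
hazard_ratio = smoker / non_smoker
print(f"hazard ratio for smoking = {hazard_ratio:.2f}")
```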
Kaplan-Meier survival curves
• Used to display time-to-event data, especially survival data
• The proportion surviving ranges from 1.0 (or 100%) to 0.0 (or 0%)
• Accounts for censored observations when estimating survival
• Used in the medical field to analyze:
• Effectiveness of treatments
• Survival rate of participants
• How to create a Kaplan-Meier survival curve:
• Identify the starting point
• Observe the event
• Calculate the probability of survival
• Plot the curve
Fig. Kaplan-Meier survival curve
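The survival-probability step can be sketched with the product-limit calculation: at each time when an event occurs, the running survival probability is multiplied by the fraction of people at risk who survive that time. The follow-up times and event flags are hypothetical. An event flag of 0 marks a censored observation.

```python
def kaplan_meier(times, events):
    """Return (time, survival probability) pairs at each observed event time.

    times  : follow-up time for each participant
    events : 1 if the event was observed at that time, 0 if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Participants still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Participants who had the event exactly at time t.
        had_event = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - had_event / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up (months) for five participants; two are censored.
times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 1, 0]

for t, s in kaplan_meier(times, events):
    print(f"month {t}: estimated survival = {s:.3f}")
```

Censored participants count as at risk up to their censoring time but never cause a drop in the curve. Plotting these pairs as a step function gives the curve.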
Meta-analysis
• Statistical analysis combining the results of separate but comparable studies
• Used to identify an overall trend
• Different from other studies: no new data are collected
• Steps for a successful meta-analysis:
• Formulating the problem and study design
• Identifying relevant studies
• Excluding poorly conducted studies or those with major methodological flaws
• Measuring, combining and interpreting the results
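The slides do not say how results are combined. One common approach, assumed here, is fixed-effect inverse-variance weighting: each study's estimate is weighted by 1 / SE², so more precise studies count for more. The function name and study values are hypothetical.

```python
import math

def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect estimates (e.g. mean differences) from three studies.
estimates = [2.1, 1.4, 1.9]
standard_errors = [0.8, 0.5, 1.2]

pooled, pooled_se = pool_fixed_effect(estimates, standard_errors)
print(f"pooled estimate = {pooled:.2f} (SE {pooled_se:.2f})")
```

The pooled standard error is smaller than that of any single study, which is why pooling helps when individual sample sizes are too small.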
Reasons for the surge in meta-analysis
• Ethical reasons
• Cost issues
• The need to have an overall idea of effects in different populations
• To make conclusive judgements from aggregate studies when the sample size of a single study is too small
