Survival Analysis - lecture 3

The document discusses the Log-Rank Test, a statistical method for comparing survival distributions among groups, particularly in clinical trials. It explains the test's benefits, such as handling censored data and its non-parametric nature, and provides examples of its application. Additionally, it covers stratification in survival analysis and Fisher's Exact Test for assessing associations between binary variables.

Uploaded by

alcinialbob1234
Copyright: © All Rights Reserved

Group 3

The Team
Youssef Sayed Muhamad Shokry Assem Khalifa

Naira Magdy Salam Essam Basel Mohamed

Mahmoud Abdel Fattah Mohamed Anter


Introduction
We will talk about
What is our goal?
What is the survival time?
What are the defined functions of survival time?
What is the Kaplan-Meier estimator?
Lecture Motivation
Log-Rank Test
What is the Log-Rank Test?
The Log-Rank Test is a statistical method used to compare the
survival distributions of two or more groups.
It assesses whether there are significant differences in survival
times among different treatment or intervention groups.
The test is widely used in clinical trials, especially in oncology
and other medical research fields.
Why Use the Log-Rank Test?
Time-to-Event Comparisons: Use it to compare the time until an
event (e.g., death, recovery, failure) occurs in different groups.
Simplicity: Easy to implement and interpret, making it
accessible for researchers.
Non-Parametric: Does not assume a specific survival time
distribution, so it applies to survival data of many shapes.
Handles Censored Data: Accommodates censored
observations (e.g., patients lost to follow-up), so those patients
still contribute information up to the time they were censored.
Focus on Survival Curves: The most common method for
comparing survival curves, helping determine whether one group
has better survival outcomes than another.
Examples of the Log-Rank Test
Clinical Trial: Comparing the survival times of patients
treated with Drug A vs. Drug B.
Heart Disease Study: Comparing survival rates of patients
who underwent surgery vs. those on medication.
Smoking and Lung Disease: Comparing survival times of
smokers vs. non-smokers after being diagnosed with lung
disease.
How Does the Log-Rank Test Work?
Calculation: It compares the observed number of events (e.g.,
deaths) in each group to the expected number of events
under the null hypothesis of no difference in survival.
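In symbols, the comparison can be written as follows. The notation is an assumption introduced here, not taken from the slides: at each distinct event time t_j, d_j is the total number of events, n_j the number at risk in both groups together, n_{1j} and n_{2j} the numbers at risk in each group, and O_{1j} the events observed in group 1.

```latex
E_{1j} = d_j \,\frac{n_{1j}}{n_j}, \qquad
V_j = \frac{n_{1j}\, n_{2j}\, d_j\, (n_j - d_j)}{n_j^{2}\,(n_j - 1)}, \qquad
\chi^2 = \frac{\Bigl(\sum_j \bigl(O_{1j} - E_{1j}\bigr)\Bigr)^{2}}{\sum_j V_j}
```

Under the null hypothesis this statistic is approximately chi-square with 1 degree of freedom. Hand calculations often use the simpler approximation that sums (O_g - E_g)^2 / E_g over the groups g.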
Example Scenario
You’re conducting a clinical trial to compare the survival times of
patients receiving Treatment A and Treatment B for a specific
type of cancer. You follow the patients for a period of time,
recording their survival times and noting if they were censored
(meaning the patient either didn’t experience the event by the
end of the study or was lost to follow-up).
Example Data
Calculate Expected Events at Each Time Point
Apply Log-Rank Test Formula
Calculate Total Log-Rank Statistic
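As a rough illustration of these three steps, here is a minimal Python sketch of a two-group log-rank calculation. The survival times and event flags are hypothetical, not the trial data from the example, and the function name `logrank_two_groups` is made up for this sketch. It uses the variance-based form of the statistic, which may differ slightly from a hand calculation based on the sum of (O - E)^2 / E.

```python
def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Return (observed_a, expected_a, chi_square) for a two-group log-rank test.

    events_* hold 1 for an observed event and 0 for a censored observation.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    # Only time points where at least one event occurred contribute.
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Events occurring exactly at time t, per group.
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n  # expected events in A under H0
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))

    if variance == 0:
        raise ValueError("variance is zero; the statistic is undefined")
    chi_square = (observed_a - expected_a) ** 2 / variance
    return observed_a, expected_a, chi_square


# Hypothetical survival times in months; 1 = death, 0 = censored.
times_a, events_a = [6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1]
times_b, events_b = [5, 8, 12, 14, 20, 30], [1, 1, 0, 1, 1, 0]
observed, expected, chi_square = logrank_two_groups(
    times_a, events_a, times_b, events_b
)
print(f"O_A = {observed}, E_A = {expected:.3f}, chi-square = {chi_square:.3f}")
```

Swapping the two groups leaves the statistic unchanged, because the observed-minus-expected difference only changes sign.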
Conclusion
The Log-Rank Statistic (0.476) is compared to a chi-square
distribution with 1 degree of freedom. Using statistical tables
or software, you would find the p-value.

If the p-value is below the significance level (typically 0.05),
you would reject the null hypothesis, concluding there is a
significant difference between the survival curves of Group A
and Group B. Otherwise, you fail to reject the null hypothesis:
the data show no significant difference. Here, 0.476 is well
below the 5% critical value of 3.841, so the null hypothesis is
not rejected.
Chi-Square Table
Why Use Chi-Square?
Chi-Square is a statistical test used to measure how far
the observed events are from the expected events.
Why it’s important: It helps us decide whether the difference
between groups could be due to chance alone or reflects a real
difference.
Log-Rank Test in R
Stratification in Survival Analysis
What is Stratification?
Definition: Stratification is the process of dividing a study
population into subgroups based on characteristics like age,
gender, or risk factors.
Purpose: It improves the accuracy of the analysis by controlling
for confounding variables that could influence survival
outcomes.
Example of Stratification
Scenario:
Group 1: Patients receiving a new treatment.
Group 2: Patients receiving a standard treatment.

Objective: Compare the survival curves of these two groups.


Why Do We Stratify?
Remove Confounding Effects: Treatments are compared within
each stratum, so a factor like age cannot distort the comparison
even if it is unevenly distributed across treatment groups.
Increase Precision: Allows for detailed comparison of survival
curves within subgroups.
Improve Validity: Helps ensure the difference observed is due to
the factor studied and not to other variables.
Example Data
Steps of Analysis
Step 1: Stratify Patients by Age
We divide the patients into two age groups:
Age Group 1: Patients younger than 50 years.
Age Group 2: Patients 50 years and older.
Step 2: Analyze Age Group 1 (<50 years)
Result: In this age group, patients receiving treatment B show
longer survival times compared to those receiving treatment A.
Step 3: Analyze Age Group 2 (≥50 years)
Result: In this age group, it appears that patients receiving
treatment A may have longer survival times compared to those
receiving treatment B.
Step 4: Stratified Log-Rank Test
We use the stratified log-rank test to compare the survival
curves for treatments A and B, accounting for the effect of
age. Here is the complete R code that implements all the
steps:
p-value (p = 0.5):
A p-value of 0.5 is well above 0.05, so after accounting for
age there is no statistically significant difference in survival
between the two treatments (A and B).
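To show how a stratified test pools evidence, here is a minimal Python sketch under stated assumptions: the data are hypothetical, the function names are made up for this sketch, and it is not the R code from the slides. Each stratum contributes its own observed-minus-expected difference and variance, and the sums are combined into a single chi-square statistic with 1 degree of freedom.

```python
def logrank_components(times_a, events_a, times_b, events_b):
    """Return (O_A - E_A, variance) for treatment A within one stratum."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    diff = variance = 0.0
    for t in sorted({t for t, e, _ in data if e == 1}):
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        d = sum(1 for u, e, _ in data if u == t and e == 1)
        n = n_a + n_b
        diff += d_a - d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    return diff, variance


def stratified_logrank(strata):
    """strata: list of (times_a, events_a, times_b, events_b), one per stratum."""
    total_diff = total_var = 0.0
    for stratum in strata:
        diff, var = logrank_components(*stratum)
        total_diff += diff  # differences are summed before squaring
        total_var += var
    if total_var == 0:
        raise ValueError("variance is zero; the statistic is undefined")
    return total_diff ** 2 / total_var


# Hypothetical data: (times_A, events_A, times_B, events_B) per age stratum.
# Under 50: treatment B patients live longer. 50 and over: the same numbers
# with the treatments swapped, so treatment A patients live longer.
under_50 = ([4, 6, 9, 12], [1, 1, 1, 0], [8, 11, 15, 18], [1, 0, 1, 0])
over_50 = ([8, 11, 15, 18], [1, 0, 1, 0], [4, 6, 9, 12], [1, 1, 1, 0])
chi_square = stratified_logrank([under_50, over_50])
print(f"stratified chi-square = {chi_square:.3f}")
```

Because the two hypothetical strata are mirror images, their differences cancel and the pooled statistic is close to 0. This is the same pattern as in the example: opposite effects within strata can leave the overall stratified test non-significant.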
Final Conclusions for Age Groups
Age Group < 50 Years:
Treatment B may be more effective.
Age Group ≥ 50 Years:
Treatment A appears to be more effective.
Fisher’s Exact Test
Fisher’s Exact Test
Definition: A statistical hypothesis test used to assess the
association between two binary variables in a contingency
table. It is particularly useful for small samples. It is a
non-parametric test: it makes no assumption about the
distribution of the underlying data, and its p-value is computed
exactly rather than from a large-sample approximation.
What is the contingency table?
It is a statistical tool that displays the frequency (count) of
each combination of two categorical variables.
Each cell in the table shows how many times a specific
combination occurs.
Expected Cell Counts
Fisher's Exact Test is typically used when:
More than 20% of the expected cell counts are less than 5.
In a 2x2 contingency table each cell is 25% of the table, so a
single cell with an expected count of less than 5 is already enough.
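The check above can be sketched in a few lines of Python. The table values are hypothetical and the function names are made up for this sketch; the expected count of a cell is its row total times its column total, divided by the grand total.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def share_below_five(table):
    """Fraction of cells whose expected count is less than 5."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(1 for e in cells if e < 5) / len(cells)


# Hypothetical 2x2 table of observed counts.
table = [[8, 2], [1, 5]]
print(expected_counts(table))
share = share_below_five(table)
print("use Fisher's Exact Test" if share > 0.20 else "chi-squared test is fine")
```

For this table three of the four expected counts fall below 5, so the 20% rule points to Fisher's Exact Test.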
Assumptions
Categorical and Binary Variables: Both variables must be
categorical and binary, allowing for a 2x2 contingency table.
Independence: Observations should be randomly sampled and
independent of one another, and each observation falls into
exactly one cell of the table.
Sample Size: If any cell in the contingency table has an expected
count less than 5, use Fisher's Exact Test; if all expected counts
are 5 or greater, a Chi-squared test is appropriate.
Row and Column Totals: The row and column totals are treated as
fixed, not random.
Fisher's Exact Test Steps
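1. State the null and alternative hypotheses
2. Populate the contingency table and calculate the exact probability
For a 2x2 table with cells a, b in the first row, c, d in the
second row, and total n = a + b + c + d, the exact probability (P)
of the observed table is
P = [(a+b)! (c+d)! (a+c)! (b+d)!] / [a! b! c! d! n!]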
Check the condition
If the sample size is greater than or equal to 20, compute the
expected count for each cell, then compute the cell ratio: the
share of cells whose expected count is greater than or equal to 5.
Fisher's Exact Test is used when this ratio is less than 80%.
Expected Cell Count Formula
Expected count = (row total × column total) / n
Cell Ratio Formula
Cell ratio = (number of cells with expected count ≥ 5) / (total number of cells)
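3. Compute the other possible contingency tables and their exact probabilities
Keeping the row and column totals fixed, shift the cell counts
one unit at a time to obtain the other possible tables, and
compute the probability of each (P1, P2, ...). Only probabilities
less than or equal to P0, the probability of the observed table,
are included. In this procedure the step is repeated twice, or
until a cell reaches 0, so that three values are summed; the full
two-sided test sums every possible table with probability at
most P0.
4. Compute the p-value
5. Conclusion and Decision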
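The five steps can be sketched in Python with the standard library alone. This is a minimal illustration, not the slides' R code: the function names are made up, the table is hypothetical, and the p-value follows the full two-sided definition (sum over every table with the same margins whose probability does not exceed that of the observed table).

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the 2x2 table [[a, b], [c, d]].

    Equal to (a+b)!(c+d)!(a+c)!(b+d)! / (a! b! c! d! n!).
    """
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


def fisher_exact_two_sided(a, b, c, d):
    """Return (P0, p_value) for the observed table [[a, b], [c, d]]."""
    p_observed = table_probability(a, b, c, d)
    row1, row2, col1 = a + b, c + d, a + c
    # Range of values the top-left cell can take with the margins fixed.
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = 0.0
    for x in range(lowest, highest + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        # Small tolerance so ties with P0 are not lost to rounding.
        if p <= p_observed * (1 + 1e-7):
            p_value += p
    return p_observed, p_value


# Hypothetical observed table: rows = exposure yes/no, columns = outcome yes/no.
p0, p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P0 = {p0:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

For this table P0 = 16/70 and the p-value is 34/70, so the null hypothesis of no association is not rejected at the 5% level.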
Example
Let’s take an example where we are interested in the association
between coffee consumption (binary exposure: whether the
people in the study drink coffee, yes/no) and cancer (binary
outcome: whether the people in the study developed
cancer, yes/no), at the 5% level of significance.
Solution
Since n is 20, we need to check the expected cell counts
and the cell ratio.
Step 1
The null hypothesis (H0): Is that there is no association
between coffee consumption and cancer in the study.
The alternative hypothesis (H1): Is that there is an
association between coffee consumption and cancer.
Step 2
Step 3
Only probabilities less than or equal to P0 are included; P1
exceeds P0, so this value is not taken.
Step 4
Step 5
Fisher’s Exact test using R
THANK
YOU
