Frank E. Harrell, Jr.

Regression Modeling
Strategies
With Applications to Linear Models,
Logistic and Ordinal Regression,
and Survival Analysis

Second Edition

Frank E. Harrell, Jr.
Department of Biostatistics
School of Medicine
Vanderbilt University
Nashville, TN, USA

ISSN 0172-7397 ISSN 2197-568X (electronic)

Springer Series in Statistics
ISBN 978-3-319-19424-0 ISBN 978-3-319-19425-7 (eBook)
DOI 10.1007/978-3-319-19425-7

Library of Congress Control Number: 2015942921

Springer Cham Heidelberg New York Dordrecht London

© Springer Science+Business Media New York 2001
© Springer International Publishing Switzerland 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, express or implied, with respect to the material contained herein or for any
errors or omissions that may have been made.

Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)
To the memories of Frank E. Harrell, Sr.,
Richard Jackson, L. Richard Smith, John
Burdeshaw, and Todd Nick, and with
appreciation to Liana and Charlotte
Harrell, two high school math teachers:
Carolyn Wailes (née Gaston) and Floyd
Christian, two college professors: David
Hurst (who advised me to choose the field
of biostatistics) and Doug Stocks, and my
graduate advisor P. K. Sen.
Preface

There are many books that are excellent sources of knowledge about
individual statistical tools (survival models, general linear models, etc.), but
the art of data analysis is about choosing and using multiple tools. In the
words of Chatfield [100, p. 420] “. . . students typically know the technical de-
tails of regression for example, but not necessarily when and how to apply it.
This argues the need for a better balance in the literature and in statistical
teaching between techniques and problem solving strategies.” Whether ana-
lyzing risk factors, adjusting for biases in observational studies, or developing
predictive models, there are common problems that few regression texts ad-
dress. For example, there are missing data in the majority of datasets one is
likely to encounter (other than those used in textbooks!) but most regression
texts do not include methods for dealing with such data effectively, and most
texts on missing data do not cover regression modeling.
This book links standard regression modeling approaches with
• methods for relaxing linearity assumptions that still allow one to easily
obtain predictions and confidence limits for future observations, and to do
formal hypothesis tests,
• non-additive modeling approaches not requiring the assumption that
interactions are always linear × linear,
• methods for imputing missing data and for penalizing variances for incom-
plete data,
• methods for handling large numbers of predictors without resorting to
problematic stepwise variable selection techniques,
• data reduction methods (unsupervised learning methods, some of which
are based on multivariate psychometric techniques too seldom used in
statistics) that help with the problem of “too many variables to analyze and
not enough observations” as well as making the model more interpretable
when there are predictor variables containing overlapping information,
• methods for quantifying predictive accuracy of a fitted model,
• powerful model validation techniques based on the bootstrap that allow the
analyst to estimate predictive accuracy nearly unbiasedly without holding
back data from the model development process, and
• graphical methods for understanding complex models.
On the last point, this text has special emphasis on what could be called
“presentation graphics for fitted models” to help make regression analyses
more palatable to non-statisticians. For example, nomograms have long been
used to make equations portable, but they are not drawn routinely because
doing so is very labor-intensive. An R function called nomogram in the package
described below draws nomograms from a regression fit, and these diagrams
can be used to communicate modeling results as well as to obtain predicted
values manually even in the presence of complex variable transformations.
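As a rough illustration of the nomogram function mentioned above, the following sketch fits a binary logistic model to simulated data and draws its nomogram. The data, variable names, and coefficients are invented for illustration only, and the code assumes the rms package is installed.

```r
library(rms)  # assumes the rms package is installed

set.seed(1)
n <- 500
# Simulated predictors (hypothetical names, not from the text)
d <- data.frame(
  age = rnorm(n, 50, 10),
  sex = factor(sample(c("female", "male"), n, replace = TRUE))
)
# Simulated binary outcome; the coefficients are arbitrary illustration values
lp  <- -3 + 0.05 * d$age + 0.5 * (d$sex == "male")
d$y <- rbinom(n, 1, plogis(lp))

# rms plotting and summary functions need the predictor distributions
dd <- datadist(d)
options(datadist = "dd")

# Logistic fit with a restricted cubic spline (4 knots) in age
fit <- lrm(y ~ rcs(age, 4) + sex, data = d)

# Nomogram with an extra axis translating the linear predictor to a probability
nom <- nomogram(fit, fun = plogis, funlabel = "Probability of y = 1")
plot(nom)
```

The spline term shows up as a single nonlinear age axis, which is one way a nomogram can carry a complex transformation without the reader having to evaluate it by hand.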
Most of the methods in this text apply to all regression models, but special
emphasis is given to some of the most popular ones: multiple regression using
least squares and its generalized least squares extension for serial (repeated
measurement) data, the binary logistic model, models for ordinal responses,
parametric survival regression models, and the Cox semiparametric survival
model. There is also a chapter on nonparametric transform-both-sides regres-
sion. Emphasis is given to detailed case studies for these methods as well as
for data reduction, imputation, model simplification, and other tasks. Ex-
cept for the case study on survival of Titanic passengers, all examples are
from biomedical research. However, the methods presented here have broad
application to other areas including economics, epidemiology, sociology, psy-
chology, engineering, and predicting consumer behavior and other business
outcomes.
This text is intended for Masters or PhD level graduate students who
have had a general introductory probability and statistics course and who
are well versed in ordinary multiple regression and intermediate algebra. The
book is also intended to serve as a reference for data analysts and statistical
methodologists. Readers without a strong background in applied statistics
may wish to first study one of the many introductory applied statistics and
regression texts that are available. The author’s course notes Biostatistics
for Biomedical Research on the text’s web site cover basic regression and
many other topics. The paper by Nick and Hardin [476] also provides a good
introduction to multivariable modeling and interpretation. There are many
excellent intermediate level texts on regression analysis. One of them is by
Fox, which also has a companion software-based text [200, 201]. For readers
interested in medical or epidemiologic research, Steyerberg’s excellent text
Clinical Prediction Models [586] is an ideal companion for Regression Modeling
Strategies. Steyerberg’s book provides further explanations, examples, and
simulations of many of the methods presented here. And no text on regression
modeling should fail to mention the seminal work of John Nelder [450].
The overall philosophy of this book is summarized by the following state-
ments.
• Satisfaction of model assumptions improves precision and increases statistical
power.
• It is more productive to make a model fit step by step (e.g., transformation
estimation) than to postulate a simple model and find out what went
wrong.
• Graphical methods should be married to formal inference.
• Overfitting occurs frequently, so data reduction and model validation are
important.
• In most research projects, the cost of data collection far outweighs the cost
of data analysis, so it is important to use the most efficient and accurate
modeling techniques, to avoid categorizing continuous variables, and to
not remove data from the estimation sample just to be able to validate the
model.
• The bootstrap is a breakthrough for statistical modeling, and the analyst
should use it for many steps of the modeling strategy, including deriva-
tion of distribution-free confidence intervals and estimation of optimism
in model fit that takes into account variations caused by the modeling
strategy.
• Imputation of missing data is better than discarding incomplete observa-
tions.
• Variance often dominates bias, so biased methods such as penalized max-
imum likelihood estimation yield models that have a greater chance of
accurately predicting future observations.
• Software without multiple facilities for assessing and fixing model fit may
only seem to be user-friendly.
• Carefully fitting an improper model is better than badly fitting (and over-
fitting) a well-chosen one.
• Methods that work for all types of regression models are the most valuable.
• Using the data to guide the data analysis is almost as dangerous as not
doing so.
• There are benefits to modeling by deciding how many degrees of freedom
(i.e., number of regression parameters) can be “spent,” deciding where they
should be spent, and then spending them.
On the last point, the author believes that significance tests and P-values
are problematic, especially when making modeling decisions. Judging by the
increased emphasis on confidence intervals in scientific journals, there is reason
to believe that hypothesis testing is gradually being de-emphasized. Yet the
reader will notice that this text contains many P-values. How does that make
sense when, for example, the text recommends against simplifying a model
when a test of linearity is not significant? First, some readers may wish to
emphasize hypothesis testing in general, and some hypotheses have special
interest, such as in pharmacology where one may be interested in whether
the effect of a drug is linear in log dose. Second, many of the more interesting
hypothesis tests in the text are tests of complexity (nonlinearity, interaction)
of the overall model. Null hypotheses of linearity of effects in particular are
frequently rejected, providing formal evidence that the analyst’s investment
of time to use more than simple statistical models was warranted.
The rapid development of Bayesian modeling methods and rise in their use
is exciting. Full Bayesian modeling greatly reduces the need for the approxi-
mations made for confidence intervals and distributions of test statistics, and
Bayesian methods formalize the still rather ad hoc frequentist approach to
penalized maximum likelihood estimation by using skeptical prior distribu-
tions to obtain well-defined posterior distributions that automatically deal
with shrinkage. The Bayesian approach also provides a formal mechanism for
incorporating information external to the data. Although Bayesian methods
are beyond the scope of this text, the text is Bayesian in spirit by emphasizing
the careful use of subject matter expertise while building statistical models.
The text emphasizes predictive modeling, but as discussed in Chapter 1,
developing good predictions goes hand in hand with accurate estimation of
effects and with hypothesis testing (when appropriate). Besides emphasis
on multivariable modeling, the text includes a chapter (Chapter 17) introducing
survival analysis and methods for analyzing various types of single and multiple
events. This book does not provide examples of analyses of one common
type of response variable, namely, cost and related measures of resource
consumption. However, least squares modeling presented in Section 15.1, the
robust rank-based methods presented in Chapters 13, 15, and 20, and the
transform-both-sides regression models discussed in Chapter 16 are very ap-
plicable and robust for modeling economic outcomes. See [167] and [260] for
example analyses of such dependent variables using, respectively, the Cox
model and nonparametric additive regression. The central Web site for this
book (see the Appendix) has much more material on the use of the Cox model
for analyzing costs.
This text does not address some important study design issues that if not
respected can doom a predictive modeling or estimation project to failure.
See Laupacis, Sekar, and Stiell [378] for a list of some of these issues.
Heavy use is made of the S language used by R. R is the focus because
it is an elegant object-oriented system in which it is easy to implement new
statistical ideas. Many R users around the world have done so, and their work
has benefited many of the procedures described here. R also has a uniform
syntax for specifying statistical models (with respect to categorical predictors,
interactions, etc.), no matter which type of model is being fitted [96].
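The uniform model syntax can be sketched with base R alone. In this minimal example, which uses simulated data and hypothetical variable names, the same right-hand side (a continuous predictor, a categorical predictor, and their interaction) is passed to a least squares fit and to a binary logistic fit.

```r
set.seed(2)
n <- 200
# Simulated data (hypothetical names, not from the text)
d <- data.frame(
  age   = rnorm(n, 50, 10),
  treat = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
d$y   <- 1 + 0.02 * d$age + rnorm(n)     # continuous response
d$bin <- as.integer(d$y > median(d$y))   # binary response derived for illustration

# Same right-hand side for two different model types:
# main effects plus an age-by-treatment interaction
f_ols   <- lm(y ~ age * treat, data = d)
f_logit <- glm(bin ~ age * treat, family = binomial, data = d)

# Both fits expand the categorical predictor and the interaction the same way
names(coef(f_ols))
names(coef(f_logit))
```

Only the fitting function and the response change between the two calls. The handling of the factor and the interaction is done by the formula machinery, not by the individual fitting function.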
The free, open-source statistical software system R has been adopted by
analysts and research statisticians worldwide. Its capabilities are growing
exponentially because of the involvement of an ever-growing community of
statisticians who are adding new tools to the base R system through con-
tributed packages. All of the functions used in this text are available in R.
See the book’s Web site for updated information about software availability.
Readers who don’t use R or any other statistical software environment will
still find the statistical methods and case studies in this text useful, and it is
hoped that the code that is presented will make the statistical methods more
concrete. At the very least, the code demonstrates that all of the methods
presented in the text are feasible.
This text does not teach analysts how to use R. For that, the reader may
wish to see reading recommendations on www.r-project.org as well as Venables
and Ripley [635] (which is also an excellent companion to this text) and the
many other excellent texts on R. See the Appendix for more information.
In addition to powerful features that are built into R, this text uses a
package of freely available R functions called rms written by the author. rms
tracks modeling details related to the expanded X or design matrix. It is a
series of over 200 functions for model fitting, testing, estimation, validation,
graphics, prediction, and typesetting by storing enhanced model design at-
tributes in the fit. rms includes functions for least squares and penalized least
squares multiple regression modeling in addition to functions for binary and
ordinal regression, generalized least squares for analyzing serial data, quan-
tile regression, and survival analysis that are emphasized in this text. Other
freely available miscellaneous R functions used in the text are found in the
Hmisc package also written by the author. Functions in Hmisc include facilities
for data reduction, imputation, power and sample size calculation, advanced
table making, recoding variables, importing and inspecting data, and general
graphics. Consult the Appendix for information on obtaining Hmisc and rms.
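To give a flavor of how the two packages are used together, here is a minimal sketch on simulated data. It assumes both rms and Hmisc are installed; the data, variable names, and the choices of 5 imputations and 100 bootstrap repetitions are illustrative assumptions, not recommendations taken from the text.

```r
library(Hmisc)  # assumes the Hmisc package is installed
library(rms)    # assumes the rms package is installed

set.seed(3)
n <- 400
# Simulated data (hypothetical names); coefficients are arbitrary
d <- data.frame(age = rnorm(n, 50, 10), chol = rnorm(n, 200, 30))
d$y <- rbinom(n, 1, plogis(-4 + 0.04 * d$age + 0.01 * d$chol))
d$chol[sample(n, 60)] <- NA   # set 60 chol values to missing, completely at random

# Multiple imputation with predictive mean matching; the outcome is included
# in the imputation model
imp <- aregImpute(~ y + age + chol, data = d, n.impute = 5, pr = FALSE)

# Fit the same logistic model on each completed dataset and combine the fits
fit_mi <- fit.mult.impute(y ~ rcs(age, 4) + chol, lrm, imp, data = d, pr = FALSE)

# Bootstrap validation on a model using only the complete predictor;
# validate() needs the design matrix and response stored in the fit
full <- lrm(y ~ rcs(age, 4), data = d, x = TRUE, y = TRUE)
val  <- validate(full, B = 100)
val["Dxy", c("index.orig", "optimism", "index.corrected")]
```

The validation step refits the model in each bootstrap sample and subtracts the estimated optimism from the apparent index, so no observations are held back from model development.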
The author and his colleagues have written SAS macros for fitting re-
stricted cubic splines and for other basic operations. See the Appendix for
more information. It is unfair not to mention some excellent capabilities of
other statistical packages such as Stata (which has also been extended to
provide regression splines and other modeling tools), but the extendability
and graphics of R make it especially attractive for all aspects of the
comprehensive modeling strategy presented in this book.
Portions of Chapters 4 and 20 were published as reference [269]. Some of
Chapter 13 was published as reference [272].
The author may be contacted by electronic mail at f.harrell@vanderbilt.edu
and would appreciate being informed of unclear points, errors,
and omissions in this book. Suggestions for improvements and for future
topics are also welcome. As described in the Web site, instructors may con-
tact the author to obtain copies of quizzes and extra assignments (both with
answers) related to much of the material in the earlier chapters, and to obtain
full solutions (with graphical output) to the majority of assignments in the
text.
Major changes since the first edition include the following:
1. Creation of a now mature R package, rms, that replaces and greatly ex-
tends the Design library used in the first edition
2. Conversion of all of the book’s code to R
3. Conversion of the book source into knitr [677] reproducible documents
4. All code from the text is executable and is on the web site
5. Use of color graphics and use of the ggplot2 graphics package [667]
6. Scanned images were re-drawn
7. New text about problems with dichotomization of continuous variables
and with classification (as opposed to prediction)
8. Expanded material on multiple imputation and predictive mean match-
ing and emphasis on multiple imputation (using the Hmisc aregImpute
function) instead of single imputation
9. Addition of redundancy analysis
10. Added a new section in Chapter 5 on bootstrap confidence intervals for
rankings of predictors
11. Replacement of the U.S. presidential election data with analyses of a new
diabetes dataset from NHANES using ordinal and quantile regression
12. More emphasis on semiparametric ordinal regression models for continuous
Y, as direct competitors of ordinary multiple regression, with a
detailed case study
13. A new chapter on generalized least squares for analysis of serial response
data
14. The case study in imputation and data reduction was completely reworked
and now focuses only on data reduction, with the addition of sparse prin-
cipal components
15. More information about indexes of predictive accuracy
16. Augmentation of the chapter on maximum likelihood to include more
flexible ways of testing contrasts as well as new methods for obtaining
simultaneous confidence intervals
17. Binary logistic regression case study 1 was completely re-worked, now
providing examples of model selection and model approximation accuracy
18. Single imputation was dropped from binary logistic case study 2
19. The case study in transform-both-sides regression modeling has been re-
worked using simulated data where true transformations are known, and
a new example of the smearing estimator was added
20. Addition of 225 references, most of them published 2001–2014
21. New guidance on minimum sample sizes needed by some of the models
22. De-emphasis of bootstrap bumping [610] for obtaining simultaneous con-
fidence regions, in favor of a general multiplicity approach [307].

Acknowledgments

A good deal of the writing of the first edition of this book was done during
my 17 years on the faculty of Duke University. I wish to thank my close colleague
Kerry Lee for providing many valuable ideas, fruitful collaborations,
and well-organized lecture notes from which I have greatly benefited over the
past years. Terry Therneau of Mayo Clinic has given me many of his wonderful
ideas for many years, and has written state-of-the-art R software for survival
analysis that forms the core of survival analysis software in my rms package.
Michael Symons of the Department of Biostatistics of the University of North
Carolina at Chapel Hill and Timothy Morgan of the Division of Public Health
Sciences at Wake Forest University School of Medicine also provided course
materials, some of which motivated portions of this text. My former clini-
cal colleagues in the Cardiology Division at Duke University, Robert Califf,
Phillip Harris, Mark Hlatky, Dan Mark, David Pryor, and Robert Rosati,
for many years provided valuable motivation, feedback, and ideas through
our interaction on clinical problems. Besides Kerry Lee, statistical colleagues
L. Richard Smith, Lawrence Muhlbaier, and Elizabeth DeLong clarified my
thinking and gave me new ideas on numerous occasions. Charlotte Nelson
and Carlos Alzola frequently helped me debug S routines when they thought
they were just analyzing data.
Former students Bercedis Peterson, James Herndon, Robert McMahon,
and Yuan-Li Shen have provided many insights into logistic and survival mod-
eling. Associations with Doug Wagner and William Knaus of the University
of Virginia, Ken Offord of Mayo Clinic, David Naftel of the University of Al-
abama in Birmingham, Phil Miller of Washington University, and Phil Good-
man of the University of Nevada Reno have provided many valuable ideas and
motivations for this work, as have Michael Schemper of Vienna University,
Janez Stare of Ljubljana University, Slovenia, Ewout Steyerberg of Erasmus
University, Rotterdam, Karel Moons of Utrecht University, and Drew Levy of
Genentech. Richard Goldstein, along with several anonymous reviewers, pro-
vided many helpful criticisms of a previous version of this manuscript that
resulted in significant improvements, and critical reading by Bob Edson (VA
Cooperative Studies Program, Palo Alto) resulted in many error corrections.
Thanks to Brian Ripley of the University of Oxford for providing many help-
ful software tools and statistical insights that greatly aided in the production
of this book, and to Bill Venables of CSIRO Australia for wisdom, both sta-
tistical and otherwise. This work would also not have been possible without
the S environment developed by Rick Becker, John Chambers, Allan Wilks,
and the R language developed by Ross Ihaka and Robert Gentleman.
Work for the second edition was done in the excellent academic environ-
ment of Vanderbilt University, where biostatistical and biomedical colleagues
and graduate students provided new insights and stimulating discussions.
Thanks to Nick Cox, Durham University, UK, who provided from his careful
reading of the first edition a very large number of improvements and correc-
tions that were incorporated into the second. Four anonymous reviewers of
the second edition also made numerous suggestions that improved the text.

Nashville, TN, USA Frank E. Harrell, Jr.

July 2015
Contents

Typographical Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxv

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Hypothesis Testing, Estimation, and Prediction . . . . . . . . . . . 1
1.2 Examples of Uses of Predictive Multivariable Modeling . . . . . 3
1.3 Prediction vs. Classiﬁcation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Planning for Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4.1 Emphasizing Continuous Variables . . . . . . . . . . . . . . . 8
1.5 Choice of the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.6 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2 General Aspects of Fitting Regression Models . . . . . . . . . . . . 13

2.1 Notation for Multivariable Regression Models . . . . . . . . . . . . . 13
2.2 Model Formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Interpreting Model Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.1 Nominal Predictors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.2 Interactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.3 Example: Inference for a Simple Model . . . . . . . . . . . . 17
2.4 Relaxing Linearity Assumption for Continuous Predictors . . 18
2.4.1 Avoiding Categorization . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.2 Simple Nonlinear Terms . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.4.3 Splines for Estimating Shape of Regression
Function and Determining Predictor
Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4.4 Cubic Spline Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.4.5 Restricted Cubic Splines . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4.6 Choosing Number and Position of Knots . . . . . . . . . . 26
2.4.7 Nonparametric Regression . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.8 Advantages of Regression Splines over
Other Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

xv
xvi Contents

2.5 Recursive Partitioning: Tree-Based Models . . . . . . . . . . . . . . . . 30

2.6 Multiple Degree of Freedom Tests of Association . . . . . . . . . . 31
2.7 Assessment of Model Fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.7.1 Regression Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.7.2 Modeling and Testing Complex Interactions . . . . . . . 36
2.7.3 Fitting Ordinal Predictors . . . . . . . . . . . . . . . . . . . . . . . 38
2.7.4 Distributional Assumptions . . . . . . . . . . . . . . . . . . . . . . 39
2.8 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.9 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

3 Missing Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.1 Types of Missing Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2 Prelude to Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3 Missing Values for Diﬀerent Types of Response Variables . . . 47
3.4 Problems with Simple Alternatives to Imputation . . . . . . . . . 47
3.5 Strategies for Developing an Imputation Model . . . . . . . . . . . . 49
3.6 Single Conditional Mean Imputation . . . . . . . . . . . . . . . . . . . . . 52
3.7 Predictive Mean Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.8 Multiple Imputation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.8.1 The aregImpute and Other Chained Equations
Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.9 Diagnostics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.10 Summary and Rough Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.11 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.12 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

4 Multivariable Modeling Strategies . . . . . . . . . . . . . . . . . . . . . . . . 63

4.1 Prespecification of Predictor Complexity Without
Later Simplification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.2 Checking Assumptions of Multiple Predictors
Simultaneously . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3 Variable Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.4 Sample Size, Overfitting, and Limits on Number
of Predictors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.5 Shrinkage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.6 Collinearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.7 Data Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.7.1 Redundancy Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.7.2 Variable Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.7.3 Transformation and Scaling Variables Without
Using Y . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.7.4 Simultaneous Transformation and Imputation . . . . . . 83
4.7.5 Simple Scoring of Variable Clusters . . . . . . . . . . . . . . . 85
4.7.6 Simplifying Cluster Scores . . . . . . . . . . . . . . . . . . . . . . . 87
4.7.7 How Much Data Reduction Is Necessary? . . . . . . . . . 87
Contents xvii

4.8 Other Approaches to Predictive Modeling . . . . . . . . . . . . . . . . 89

4.9 Overly Inﬂuential Observations . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.10 Comparing Two Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.11 Improving the Practice of Multivariable Prediction . . . . . . . . 94
4.12 Summary: Possible Modeling Strategies . . . . . . . . . . . . . . . . . . 94
4.12.1 Developing Predictive Models . . . . . . . . . . . . . . . . . . . . 95
4.12.2 Developing Models for Eﬀect Estimation . . . . . . . . . . 98
4.12.3 Developing Models for Hypothesis Testing . . . . . . . . . 99
4.13 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.14 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

5 Describing, Resampling, Validating, and Simplifying

the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.1 Describing the Fitted Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.1.1 Interpreting Eﬀects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.1.2 Indexes of Model Performance . . . . . . . . . . . . . . . . . . . 104
5.2 The Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.3 Model Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.3.2 Which Quantities Should Be Used in Validation? . . . 110
5.3.3 Data-Splitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.3.4 Improvements on Data-Splitting: Resampling . . . . . . 112
5.3.5 Validation Using the Bootstrap . . . . . . . . . . . . . . . . . . 114
5.4 Bootstrapping Ranks of Predictors . . . . . . . . . . . . . . . . . . . . . . . 117
5.5 Simplifying the Final Model by Approximating It . . . . . . . . . . 118
5.5.1 Diﬃculties Using Full Models . . . . . . . . . . . . . . . . . . . . 118
5.5.2 Approximating the Full Model . . . . . . . . . . . . . . . . . . . 119
5.6 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.7 Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

6 R Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.1 The R Modeling Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.2 User-Contributed Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.3 The rms Package . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.4 Other Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.5 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

7 Modeling Longitudinal Responses using Generalized Least Squares . . . . . . . . . . . . 143
7.1 Notation and Data Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
7.2 Model Specification for Effects on E(Y) . . . . . . . . . . . . . . . . . . 144
7.3 Modeling Within-Subject Dependence . . . . . . . . . . . . . . . . . . . . 144
7.4 Parameter Estimation Procedure . . . . . . . . . . . . . . . . . . . . . . . . 147
7.5 Common Correlation Structures . . . . . . . . . . . . . . . . . . . . . . . . . 147
7.6 Checking Model Fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
7.7 Sample Size Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

7.8 R Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
7.9 Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
7.9.1 Graphical Exploration of Data . . . . . . . . . . . . . . . . . . . 150
7.9.2 Using Generalized Least Squares . . . . . . . . . . . . . . . . . 151
7.10 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

8 Case Study in Data Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

8.1 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
8.2 How Many Parameters Can Be Estimated? . . . . . . . . . . . . . . . 164
8.3 Redundancy Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
8.4 Variable Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
8.5 Transformation and Single Imputation Using transcan . . . . . 167
8.6 Data Reduction Using Principal Components . . . . . . . . . . . . . 170
8.6.1 Sparse Principal Components . . . . . . . . . . . . . . . . . . . . 175
8.7 Transformation Using Nonparametric Smoothers . . . . . . . . . . 176
8.8 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
8.9 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178

9 Overview of Maximum Likelihood Estimation . . . . . . . . . . . . 181

9.1 General Notions—Simple Cases . . . . . . . . . . . . . . . . . . . . . . . . . 181
9.2 Hypothesis Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
9.2.1 Likelihood Ratio Test . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
9.2.2 Wald Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
9.2.3 Score Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
9.2.4 Normal Distribution—One Sample . . . . . . . . . . . . . . . 187
9.3 General Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
9.3.1 Global Test Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
9.3.2 Testing a Subset of the Parameters . . . . . . . . . . . . . . . 190
9.3.3 Tests Based on Contrasts . . . . . . . . . . . . . . . . . . . . . . . . 192
9.3.4 Which Test Statistics to Use When . . . . . . . . . . . . . . . 193
9.3.5 Example: Binomial—Comparing Two
Proportions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
9.4 Iterative ML Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
9.5 Robust Estimation of the Covariance Matrix . . . . . . . . . . . . . . 196
9.6 Wald, Score, and Likelihood-Based Confidence Intervals . . . . 198
9.6.1 Simultaneous Wald Confidence Regions . . . . . . . . . . . 199
9.7 Bootstrap Confidence Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
9.8 Further Use of the Log Likelihood . . . . . . . . . . . . . . . . . . . . . . . 203
9.8.1 Rating Two Models, Penalizing for Complexity . . . . . 203
9.8.2 Testing Whether One Model Is Better
than Another . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
9.8.3 Unitless Index of Predictive Ability . . . . . . . . . . . . . . . 205
9.8.4 Unitless Index of Adequacy of a Subset
of Predictors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
9.9 Weighted Maximum Likelihood Estimation . . . . . . . . . . . . . . . 208
9.10 Penalized Maximum Likelihood Estimation . . . . . . . . . . . . . . . 209
9.11 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213

9.12 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216

10 Binary Logistic Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

10.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
10.1.1 Model Assumptions and Interpretation
of Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
10.1.2 Odds Ratio, Risk Ratio, and Risk Difference . . . . . . . 224
10.1.3 Detailed Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
10.1.4 Design Formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
10.2 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
10.2.1 Maximum Likelihood Estimates . . . . . . . . . . . . . . . . . . 231
10.2.2 Estimation of Odds Ratios and Probabilities . . . . . . . 232
10.2.3 Minimum Sample Size Requirement . . . . . . . . . . . . . . 233
10.3 Test Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
10.4 Residuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
10.5 Assessment of Model Fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
10.6 Collinearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
10.7 Overly Influential Observations . . . . . . . . . . . . . . . . . . . . . . . . . . 255
10.8 Quantifying Predictive Ability . . . . . . . . . . . . . . . . . . . . . . . . . . 256
10.9 Validating the Fitted Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
10.10 Describing the Fitted Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
10.11 R Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
10.12 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
10.13 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273

11 Binary Logistic Regression Case Study 1 . . . . . . . . . . . . . . . . . 275

11.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
11.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
11.3 Data Transformations and Single Imputation . . . . . . . . . . . . . 276
11.4 Regression on Original Variables, Principal Components
and Pretransformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
11.5 Description of Fitted Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
11.6 Backwards Step-Down . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
11.7 Model Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287

12 Logistic Model Case Study 2: Survival of Titanic Passengers . . . . . . . . . . . . 291
12.1 Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
12.2 Exploring Trends with Nonparametric Regression . . . . . . . . . . 294
12.3 Binary Logistic Model With Casewise Deletion
of Missing Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
12.4 Examining Missing Data Patterns . . . . . . . . . . . . . . . . . . . . . . . 302
12.5 Multiple Imputation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
12.6 Summarizing the Fitted Model . . . . . . . . . . . . . . . . . . . . . . . . . . 307

13 Ordinal Logistic Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311

13.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
13.2 Ordinality Assumption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
13.3 Proportional Odds Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
13.3.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
13.3.2 Assumptions and Interpretation of Parameters . . . . . 313
13.3.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
13.3.4 Residuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
13.3.5 Assessment of Model Fit . . . . . . . . . . . . . . . . . . . . . . . . 315
13.3.6 Quantifying Predictive Ability . . . . . . . . . . . . . . . . . . . 318
13.3.7 Describing the Fitted Model . . . . . . . . . . . . . . . . . . . . . 318
13.3.8 Validating the Fitted Model . . . . . . . . . . . . . . . . . . . . . 318
13.3.9 R Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
13.4 Continuation Ratio Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
13.4.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
13.4.2 Assumptions and Interpretation of Parameters . . . . . 320
13.4.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
13.4.4 Residuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
13.4.5 Assessment of Model Fit . . . . . . . . . . . . . . . . . . . . . . . . 321
13.4.6 Extended CR Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
13.4.7 Role of Penalization in Extended CR Model . . . . . . . 322
13.4.8 Validating the Fitted Model . . . . . . . . . . . . . . . . . . . . . 322
13.4.9 R Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
13.5 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
13.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324

14 Case Study in Ordinal Regression, Data Reduction, and Penalization . . . . . . . . . . . . 327
14.1 Response Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
14.2 Variable Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
14.3 Developing Cluster Summary Scores . . . . . . . . . . . . . . . . . . . . . 330
14.4 Assessing Ordinality of Y for each X, and Unadjusted
Checking of PO and CR Assumptions . . . . . . . . . . . . . . . . . . . . 333
14.5 A Tentative Full Proportional Odds Model . . . . . . . . . . . . . . . 333
14.6 Residual Plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
14.7 Graphical Assessment of Fit of CR Model . . . . . . . . . . . . . . . . 338
14.8 Extended Continuation Ratio Model . . . . . . . . . . . . . . . . . . . . . 340
14.9 Penalized Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
14.10 Using Approximations to Simplify the Model . . . . . . . . . . . . . 348
14.11 Validating the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
14.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
14.13 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
14.14 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357

15 Regression Models for Continuous Y and Case Study in Ordinal Regression . . . . . . . . . . . . 359
15.1 The Linear Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
15.2 Quantile Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
15.3 Ordinal Regression Models for Continuous Y . . . . . . . . . . . . . . 361
15.3.1 Minimum Sample Size Requirement . . . . . . . . . . . . . . 363
15.4 Comparison of Assumptions of Various Models . . . . . . . . . . . . 364
15.5 Dataset and Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . 365
15.5.1 Checking Assumptions of OLS and Other Models . . . 368
15.6 Ordinal Regression Applied to HbA1c . . . . . . . . . . . . . . . . . . . . 370
15.6.1 Checking Fit for Various Models Using Age . . . . . . . . 370
15.6.2 Examination of BMI . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
15.6.3 Consideration of All Body Size Measurements . . . . . . 375

16 Transform-Both-Sides Regression . . . . . . . . . . . . . . . . . . . . . . . . . 389

16.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
16.2 Generalized Additive Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
16.3 Nonparametric Estimation of Y -Transformation . . . . . . . . . . . 390
16.4 Obtaining Estimates on the Original Scale . . . . . . . . . . . . . . . . 391
16.5 R Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
16.6 Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393

17 Introduction to Survival Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 399

17.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
17.2 Censoring, Delayed Entry, and Truncation . . . . . . . . . . . . . . . . 401
17.3 Notation, Survival, and Hazard Functions . . . . . . . . . . . . . . . . 402
17.4 Homogeneous Failure Time Distributions . . . . . . . . . . . . . . . . . 407
17.5 Nonparametric Estimation of S and Λ . . . . . . . . . . . . . . . . . . . 409
17.5.1 Kaplan–Meier Estimator . . . . . . . . . . . . . . . . . . . . . . . . 409
17.5.2 Altschuler–Nelson Estimator . . . . . . . . . . . . . . . . . . . . . 413
17.6 Analysis of Multiple Endpoints . . . . . . . . . . . . . . . . . . . . . . . . . . 413
17.6.1 Competing Risks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
17.6.2 Competing Dependent Risks . . . . . . . . . . . . . . . . . . . . . 414
17.6.3 State Transitions and Multiple Types of Nonfatal
Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
17.6.4 Joint Analysis of Time and Severity of an Event . . . . 417
17.6.5 Analysis of Multiple Events . . . . . . . . . . . . . . . . . . . . . . 417
17.7 R Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
17.8 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
17.9 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421

18 Parametric Survival Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423

18.1 Homogeneous Models (No Predictors) . . . . . . . . . . . . . . . . . . . . 423
18.1.1 Specific Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
18.1.2 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
18.1.3 Assessment of Model Fit . . . . . . . . . . . . . . . . . . . . . . . . 426
18.2 Parametric Proportional Hazards Models . . . . . . . . . . . . . . . . . 427

18.2.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
18.2.2 Model Assumptions and Interpretation
of Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
18.2.3 Hazard Ratio, Risk Ratio, and Risk Difference . . . . . 430
18.2.4 Specific Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
18.2.5 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432
18.2.6 Assessment of Model Fit . . . . . . . . . . . . . . . . . . . . . . . . 434
18.3 Accelerated Failure Time Models . . . . . . . . . . . . . . . . . . . . . . . . 436
18.3.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
18.3.2 Model Assumptions and Interpretation
of Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
18.3.3 Specific Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
18.3.4 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
18.3.5 Residuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
18.3.6 Assessment of Model Fit . . . . . . . . . . . . . . . . . . . . . . . . 440
18.3.7 Validating the Fitted Model . . . . . . . . . . . . . . . . . . . . . 446
18.4 Buckley–James Regression Model . . . . . . . . . . . . . . . . . . . . . . . . 447
18.5 Design Formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
18.6 Test Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
18.7 Quantifying Predictive Ability . . . . . . . . . . . . . . . . . . . . . . . . . . 447
18.8 Time-Dependent Covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
18.9 R Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
18.10 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
18.11 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451

19 Case Study in Parametric Survival Modeling and Model Approximation . . . . . . . . . . . . 453
19.1 Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
19.2 Checking Adequacy of Log-Normal Accelerated Failure
Time Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
19.3 Summarizing the Fitted Model . . . . . . . . . . . . . . . . . . . . . . . . . . 466
19.4 Internal Validation of the Fitted Model Using
the Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
19.5 Approximating the Full Model . . . . . . . . . . . . . . . . . . . . . . . . . . 469
19.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473

20 Cox Proportional Hazards Regression Model . . . . . . . . . . . . . 475

20.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
20.1.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
20.1.2 Model Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
20.1.3 Estimation of β . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
20.1.4 Model Assumptions and Interpretation
of Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478
20.1.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478
20.1.6 Design Formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480

20.1.7 Extending the Model by Stratification . . . . . . . . . . . . 481
20.2 Estimation of Survival Probability and Secondary
Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
20.3 Sample Size Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486
20.4 Test Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486
20.5 Residuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
20.6 Assessment of Model Fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
20.6.1 Regression Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . 487
20.6.2 Proportional Hazards Assumption . . . . . . . . . . . . . . . . 494
20.7 What to Do When PH Fails . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
20.8 Collinearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
20.9 Overly Influential Observations . . . . . . . . . . . . . . . . . . . . . . . . . . 504
20.10 Quantifying Predictive Ability . . . . . . . . . . . . . . . . . . . . . . . . . . 504
20.11 Validating the Fitted Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506
20.11.1 Validation of Model Calibration . . . . . . . . . . . . . . . . . . 506
20.11.2 Validation of Discrimination and Other Statistical
Indexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
20.12 Describing the Fitted Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
20.13 R Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
20.14 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517

21 Case Study in Cox Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521

21.1 Choosing the Number of Parameters and Fitting
the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521
21.2 Checking Proportional Hazards . . . . . . . . . . . . . . . . . . . . . . . . . 525
21.3 Testing Interactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
21.4 Describing Predictor Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
21.5 Validating the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
21.6 Presenting the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530
21.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531

A Datasets, R Packages, and Internet Resources . . . . . . . . . . . . . 535

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571

100% (28)
Unplugged: Reclaiming Our Right To Die in America. ISBN 0814408826, 978-0814408827
23 pages
Illuminating Disease: An Introduction To Green Fluorescent Proteins. ISBN 0199362815, 978-0199362813
100% (23)
Illuminating Disease: An Introduction To Green Fluorescent Proteins. ISBN 0199362815, 978-0199362813
23 pages
Turning The World Upside Down: The Search For Global Health in The 21st Century. ISBN 1853159336, 978-1853159336
100% (20)
Turning The World Upside Down: The Search For Global Health in The 21st Century. ISBN 1853159336, 978-1853159336
23 pages
AIDS in The Twenty-First Century: Disease and Globalization by Tony Barnett (2002-09-06) .
100% (23)
AIDS in The Twenty-First Century: Disease and Globalization by Tony Barnett (2002-09-06) .
23 pages
The Power of Ideas To Transform Healthcare. ISBN 1498707408, 978-1498707404
100% (20)
The Power of Ideas To Transform Healthcare. ISBN 1498707408, 978-1498707404
23 pages
Final ScribdNo Family History: The Environmental Links To Breast Cancer (New Social Formations) - , 978-0742564084
100% (20)
Final ScribdNo Family History: The Environmental Links To Breast Cancer (New Social Formations) - , 978-0742564084
23 pages
Teaching Students in Clinical Settings (Therapy in Practice Series) - ISBN 0412452502, 978-0412452505
100% (29)
Teaching Students in Clinical Settings (Therapy in Practice Series) - ISBN 0412452502, 978-0412452505
23 pages
Haiti After The Earthquake., 978-1610390989
100% (19)
Haiti After The Earthquake., 978-1610390989
23 pages
Healthcare Workforce Transitioning: Competency Conversations Through World Café., 978-0367024031
100% (29)
Healthcare Workforce Transitioning: Competency Conversations Through World Café., 978-0367024031
23 pages
Regression Modeling Strategies With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis - 2nd Edition Illustrated eBook Download
100% (2)
Regression Modeling Strategies With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis - 2nd Edition Illustrated eBook Download
17 pages
Regression Modeling PDF
100% (1)
Regression Modeling PDF
598 pages
Case Study On Any One Layer of OSI Model
No ratings yet
Case Study On Any One Layer of OSI Model
8 pages
Activador para Office
No ratings yet
Activador para Office
2 pages
DBE-04499Deng - Display
No ratings yet
DBE-04499Deng - Display
163 pages
Energy-Mate APP User Manual-20230628
No ratings yet
Energy-Mate APP User Manual-20230628
23 pages
ASCII Code - The Extended ASCII Table
No ratings yet
ASCII Code - The Extended ASCII Table
9 pages
Hh-Ard-3P Elevator Automatic Rescue Device: Xi'An Uplift Parts Co.,Ltd
100% (1)
Hh-Ard-3P Elevator Automatic Rescue Device: Xi'An Uplift Parts Co.,Ltd
3 pages
Chapter 10: Implementing File and Print Services: Windows Platform I CH 10
No ratings yet
Chapter 10: Implementing File and Print Services: Windows Platform I CH 10
19 pages
SLP-TX420/TX423: User's Manual
No ratings yet
SLP-TX420/TX423: User's Manual
40 pages
Greedalgorithm
No ratings yet
Greedalgorithm
17 pages
SSC CGL Exam Pattern and Syllabus 2023
No ratings yet
SSC CGL Exam Pattern and Syllabus 2023
31 pages
Maxtar 200 La-Zz
No ratings yet
Maxtar 200 La-Zz
142 pages
Ex 14 - Sequential and Indexed File Allocation
No ratings yet
Ex 14 - Sequential and Indexed File Allocation
10 pages
067_THE_COACH_NDA_CHAKRA_BOOK_SOLUTION_QUESTIONS_FILE_Binary
No ratings yet
067_THE_COACH_NDA_CHAKRA_BOOK_SOLUTION_QUESTIONS_FILE_Binary
3 pages
Serial-ATA RAID Card
No ratings yet
Serial-ATA RAID Card
31 pages
2020092903481550v12a WZ5012L
No ratings yet
2020092903481550v12a WZ5012L
14 pages
Model Concour
No ratings yet
Model Concour
96 pages
Grillo Search Warrant
No ratings yet
Grillo Search Warrant
42 pages
Math 5-Q4-Module-5
No ratings yet
Math 5-Q4-Module-5
21 pages
Ericsson
100% (1)
Ericsson
5 pages
International Catalogue 2019
No ratings yet
International Catalogue 2019
47 pages
Routing and Switching Essentials Subject Outline 2024 Autumn
No ratings yet
Routing and Switching Essentials Subject Outline 2024 Autumn
11 pages
Introduction To Digital Systems 9 - Standard Combinational Modules
No ratings yet
Introduction To Digital Systems 9 - Standard Combinational Modules
62 pages
Pdf2Gerb 1.6
No ratings yet
Pdf2Gerb 1.6
11 pages
Stella Maris College (Autonomous) Chennai Department of Commerce - Shift Ii End Semester Examination - November 2021 Date Time Course Code
No ratings yet
Stella Maris College (Autonomous) Chennai Department of Commerce - Shift Ii End Semester Examination - November 2021 Date Time Course Code
1 page
IRITM Manak Nagar Lucknow - Google Search
No ratings yet
IRITM Manak Nagar Lucknow - Google Search
1 page
Job Description - Active Directory Engineer
No ratings yet
Job Description - Active Directory Engineer
2 pages
7.security Kernels
No ratings yet
7.security Kernels
4 pages
SWE 320 Object Oriented Programming (OOP) : College of Technological Innovations (Cti)
No ratings yet
SWE 320 Object Oriented Programming (OOP) : College of Technological Innovations (Cti)
21 pages
