Using R and RStudio for Data Management, Statistical Analysis, and Graphics, 2nd Edition
Nicholas J. Horton

Incorporating the latest R packages as well as new case studies and
applications, Using R and RStudio for Data Management, Statistical Analysis,
and Graphics, Second Edition covers the aspects of R most often used by
statistical analysts. New users of R will find the book’s simple approach easy
to understand while more sophisticated users will appreciate the invaluable
source of task-oriented information.

New to the Second Edition
• The use of RStudio, which increases the productivity of R users and helps
users avoid error-prone cut-and-paste workflows
• New chapter of case studies illustrating examples of useful data
management tasks, reading complex files, making and annotating maps,
“scraping” data from the web, mining text files, and generating dynamic
graphics
• New chapter on special topics that describes key features, such as
processing by group, and explores important areas of statistics, including
Bayesian methods, propensity scores, and bootstrapping
• New chapter on simulation that includes examples of data generated from
complex models and distributions
• A detailed discussion of the philosophy and use of the knitr and markdown
packages for R
• New packages that extend the functionality of R and facilitate sophisticated
analyses
• Reorganized and enhanced chapters on data input and output, data
management, statistical and mathematical functions, programming,
high-level graphics plots, and the customization of plots

Conveniently organized by short, clear descriptive entries, this edition continues
to show users how to easily perform an analytical task in R. Users can quickly
find and implement the material they need through the extensive indexing,
cross-referencing, and worked examples in the text. Datasets and code are
available for download on a supplementary website.

Nicholas J. Horton and Ken Kleinman

www.crcpress.com

Using R and
RStudio
for Data Management,
Statistical Analysis,
and Graphics
Second Edition

Nicholas J. Horton
Department of Mathematics and Statistics
Amherst College
Massachusetts, U.S.A.

Ken Kleinman
Department of Population Medicine
Harvard Medical School and
Harvard Pilgrim Health Care Institute
Boston, Massachusetts, U.S.A.

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2015 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works


Version Date: 20150126

International Standard Book Number-13: 978-1-4822-3737-5 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been
made to publish reliable data and information, but the author and publisher cannot assume responsibility for the
validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the
copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to
publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let
us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or
utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including
photocopying, microfilming, and recording, or in any information storage or retrieval system, without written
permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers,
MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of
users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has
been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com

Contents

List of Tables

List of Figures

Preface to the second edition

Preface to the first edition

1 Data input and output
1.1 Input
1.1.1 Native dataset
1.1.2 Fixed format text files
1.1.3 Other fixed files
1.1.4 Comma-separated value (CSV) files
1.1.5 Read sheets from an Excel file
1.1.6 Read data from R into SAS
1.1.7 Read data from SAS into R
1.1.8 Reading datasets in other formats
1.1.9 Reading more complex text files
1.1.10 Reading data with a variable number of words in a field
1.1.11 Read a file byte by byte
1.1.12 Access data from a URL
1.1.13 Read an XML-formatted file
1.1.14 Read an HTML table
1.1.15 Manual data entry
1.2 Output
1.2.1 Displaying data
1.2.2 Number of digits to display
1.2.3 Save a native dataset
1.2.4 Creating datasets in text format
1.2.5 Creating Excel spreadsheets
1.2.6 Creating files for use by other packages
1.2.7 Creating HTML formatted output
1.2.8 Creating XML datasets and output
1.3 Further resources

2 Data management
2.1 Structure and metadata
2.1.1 Access variables from a dataset
2.1.2 Names of variables and their types
2.1.3 Values of variables in a dataset
2.1.4 Label variables
2.1.5 Add comment to a dataset or variable
2.2 Derived variables and data manipulation
2.2.1 Add derived variable to a dataset
2.2.2 Rename variables in a dataset
2.2.3 Create string variables from numeric variables
2.2.4 Create categorical variables from continuous variables
2.2.5 Recode a categorical variable
2.2.6 Create a categorical variable using logic
2.2.7 Create numeric variables from string variables
2.2.8 Extract characters from string variables
2.2.9 Length of string variables
2.2.10 Concatenate string variables
2.2.11 Set operations
2.2.12 Find strings within string variables
2.2.13 Find approximate strings
2.2.14 Replace strings within string variables
2.2.15 Split strings into multiple strings
2.2.16 Remove spaces around string variables
2.2.17 Convert strings from upper to lower case
2.2.18 Create lagged variable
2.2.19 Formatting values of variables
2.2.20 Perl interface
2.2.21 Accessing databases using SQL
2.3 Merging, combining, and subsetting datasets
2.3.1 Subsetting observations
2.3.2 Drop or keep variables in a dataset
2.3.3 Random sample of a dataset
2.3.4 Observation number
2.3.5 Keep unique values
2.3.6 Identify duplicated values
2.3.7 Convert from wide to long (tall) format
2.3.8 Convert from long (tall) to wide format
2.3.9 Concatenate and stack datasets
2.3.10 Sort datasets
2.3.11 Merge datasets
2.4 Date and time variables
2.4.1 Create date variable
2.4.2 Extract weekday
2.4.3 Extract month
2.4.4 Extract year
2.4.5 Extract quarter
2.4.6 Create time variable
2.5 Further resources
2.6 Examples
2.6.1 Data input and output
2.6.2 Data display
2.6.3 Derived variables and data manipulation
2.6.4 Sorting and subsetting datasets

3 Statistical and mathematical functions
3.1 Probability distributions and random number generation
3.1.1 Probability density function
3.1.2 Quantiles of a probability density function
3.1.3 Setting the random number seed
3.1.4 Uniform random variables
3.1.5 Multinomial random variables
3.1.6 Normal random variables
3.1.7 Multivariate normal random variables
3.1.8 Truncated multivariate normal random variables
3.1.9 Exponential random variables
3.1.10 Other random variables
3.2 Mathematical functions
3.2.1 Basic functions
3.2.2 Trigonometric functions
3.2.3 Special functions
3.2.4 Integer functions
3.2.5 Comparisons of floating-point variables
3.2.6 Complex numbers
3.2.7 Derivatives
3.2.8 Integration
3.2.9 Optimization problems
3.3 Matrix operations
3.3.1 Create matrix from vector
3.3.2 Combine vectors or matrices
3.3.3 Matrix addition
3.3.4 Transpose matrix
3.3.5 Find the dimension of a matrix or dataset
3.3.6 Matrix multiplication
3.3.7 Finding the inverse of a matrix
3.3.8 Component-wise multiplication
3.3.9 Create a submatrix
3.3.10 Create a diagonal matrix
3.3.11 Create a vector of diagonal elements
3.3.12 Create a vector from a matrix
3.3.13 Calculate the determinant
3.3.14 Find eigenvalues and eigenvectors
3.3.15 Find the singular value decomposition
3.4 Examples
3.4.1 Probability distributions

4 Programming and operating system interface
4.1 Control flow, programming, and data generation
4.1.1 Looping
4.1.2 Conditional execution
4.1.3 Sequence of values or patterns
4.1.4 Perform an action repeatedly over a set of variables
4.1.5 Grid of values
4.1.6 Debugging
4.1.7 Error recovery
4.2 Functions
4.3 Interactions with the operating system
4.3.1 Timing commands
4.3.2 Suspend execution for a time interval
4.3.3 Execute a command in the operating system
4.3.4 Command history
4.3.5 Find working directory
4.3.6 Change working directory
4.3.7 List and access files
4.3.8 Create temporary file
4.3.9 Redirect output

5 Common statistical procedures
5.1 Summary statistics
5.1.1 Means and other summary statistics
5.1.2 Weighted means and other statistics
5.1.3 Other moments
5.1.4 Trimmed mean
5.1.5 Quantiles
5.1.6 Centering, normalizing, and scaling
5.1.7 Mean and 95% confidence interval
5.1.8 Proportion and 95% confidence interval
5.1.9 Maximum likelihood estimation of parameters
5.2 Bivariate statistics
5.2.1 Epidemiologic statistics
5.2.2 Test characteristics
5.2.3 Correlation
5.2.4 Kappa (agreement)
5.3 Contingency tables
5.3.1 Display cross-classification table
5.3.2 Displaying missing value categories in a table
5.3.3 Pearson chi-square statistic
5.3.4 Cochran–Mantel–Haenszel test
5.3.5 Cramér’s V
5.3.6 Fisher’s exact test
5.3.7 McNemar’s test
5.4 Tests for continuous variables
5.4.1 Tests for normality
5.4.2 Student’s t-test
5.4.3 Test for equal variances
5.4.4 Nonparametric tests
5.4.5 Permutation test
5.4.6 Logrank test
5.5 Analytic power and sample size calculations
5.6 Further resources
5.7 Examples
5.7.1 Summary statistics and exploratory data analysis
5.7.2 Bivariate relationships
5.7.3 Contingency tables
5.7.4 Two sample tests of continuous variables
5.7.5 Survival analysis: logrank test

6 Linear regression and ANOVA
6.1 Model fitting
6.1.1 Linear regression
6.1.2 Linear regression with categorical covariates
6.1.3 Changing the reference category
6.1.4 Parameterization of categorical covariates
6.1.5 Linear regression with no intercept
6.1.6 Linear regression with interactions
6.1.7 Linear regression with big data
6.1.8 One-way analysis of variance
6.1.9 Analysis of variance with two or more factors
6.2 Tests, contrasts, and linear functions of parameters
6.2.1 Joint null hypotheses: several parameters equal 0
6.2.2 Joint null hypotheses: sum of parameters
6.2.3 Tests of equality of parameters
6.2.4 Multiple comparisons
6.2.5 Linear combinations of parameters
6.3 Model results and diagnostics
6.3.1 Predicted values
6.3.2 Residuals
6.3.3 Standardized and Studentized residuals
6.3.4 Leverage
6.3.5 Cook’s distance
6.3.6 DFFITs
6.3.7 Diagnostic plots
6.3.8 Heteroscedasticity tests
6.4 Model parameters and results
6.4.1 Parameter estimates
6.4.2 Standardized regression coefficients
6.4.3 Coefficient plot
6.4.4 Standard errors of parameter estimates
6.4.5 Confidence interval for parameter estimates
6.4.6 Confidence limits for the mean
6.4.7 Prediction limits
6.4.8 R-squared
6.4.9 Design and information matrix
6.4.10 Covariance matrix of parameter estimates
6.4.11 Correlation matrix of parameter estimates
6.5 Further resources
6.6 Examples
6.6.1 Scatterplot with smooth fit
6.6.2 Linear regression with interaction
6.6.3 Regression coefficient plot
6.6.4 Regression diagnostics
6.6.5 Fitting a regression model separately for each value of another variable
6.6.6 Two-way ANOVA
6.6.7 Multiple comparisons
6.6.8 Contrasts

7 Regression generalizations and modeling
7.1 Generalized linear models
7.1.1 Logistic regression model
7.1.2 Conditional logistic regression model
7.1.3 Exact logistic regression
7.1.4 Ordered logistic model
7.1.5 Generalized logistic model
7.1.6 Poisson model
7.1.7 Negative binomial model
7.1.8 Log-linear model
7.2 Further generalizations
7.2.1 Zero-inflated Poisson model
7.2.2 Zero-inflated negative binomial model
7.2.3 Generalized additive model
7.2.4 Nonlinear least squares model
7.3 Robust methods
7.3.1 Quantile regression model
7.3.2 Robust regression model
7.3.3 Ridge regression model
7.4 Models for correlated data
7.4.1 Linear models with correlated outcomes
7.4.2 Linear mixed models with random intercepts
7.4.3 Linear mixed models with random slopes
7.4.4 More complex random coefficient models
7.4.5 Multilevel models
7.4.6 Generalized linear mixed models
7.4.7 Generalized estimating equations
7.4.8 MANOVA
7.4.9 Time series model
7.5 Survival analysis
7.5.1 Proportional hazards (Cox) regression model
7.5.2 Proportional hazards (Cox) model with frailty
7.5.3 Nelson–Aalen estimate of cumulative hazard
7.5.4 Testing the proportionality of the Cox model
7.5.5 Cox model with time-varying predictors
7.6 Multivariate statistics and discriminant procedures
7.6.1 Cronbach’s α
7.6.2 Factor analysis
7.6.3 Recursive partitioning
7.6.4 Linear discriminant analysis
7.6.5 Latent class analysis
7.6.6 Hierarchical clustering
7.7 Complex survey design
7.8 Model selection and assessment
7.8.1 Compare two models
7.8.2 Log-likelihood
7.8.3 Akaike Information Criterion (AIC)
7.8.4 Bayesian Information Criterion (BIC)
7.8.5 LASSO model
i i

i i
i i

“K23166” — 2015/1/28 — 9:35 — page xi — #13



7.8.6 Hosmer–Lemeshow goodness of fit . . . . . . . . . . . . . . . . . . . 103


7.8.7 Goodness of fit for count models . . . . . . . . . . . . . . . . . . . . 103
7.9 Further resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.10 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
7.10.1 Logistic regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
7.10.2 Poisson regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.10.3 Zero-inflated Poisson regression . . . . . . . . . . . . . . . . . . . . . 106
7.10.4 Negative binomial regression . . . . . . . . . . . . . . . . . . . . . . 107
7.10.5 Quantile regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.10.6 Ordered logistic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.10.7 Generalized logistic model . . . . . . . . . . . . . . . . . . . . . . . . 108
7.10.8 Generalized additive model . . . . . . . . . . . . . . . . . . . . . . . 109
7.10.9 Reshaping a dataset for longitudinal regression . . . . . . . . . . . . 110
7.10.10 Linear model for correlated data . . . . . . . . . . . . . . . . . . . . 112
7.10.11 Linear mixed (random slope) model . . . . . . . . . . . . . . . . . . 113
7.10.12 Generalized estimating equations . . . . . . . . . . . . . . . . . . . . 115
7.10.13 Generalized linear mixed model . . . . . . . . . . . . . . . . . . . . . 116
7.10.14 Cox proportional hazards model . . . . . . . . . . . . . . . . . . . . 117
7.10.15 Cronbach’s α . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7.10.16 Factor analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
7.10.17 Recursive partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.10.18 Linear discriminant analysis . . . . . . . . . . . . . . . . . . . . . . . 120
7.10.19 Hierarchical clustering . . . . . . . . . . . . . . . . . . . . . . . . . . 121

8 A graphical compendium 123


8.1 Univariate plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
8.1.1 Barplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
8.1.2 Stem-and-leaf plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.1.3 Dotplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.1.4 Histogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.1.5 Density plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.1.6 Empirical cumulative probability density plot . . . . . . . . . . . . . 125
8.1.7 Boxplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.1.8 Violin plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.2 Univariate plots by grouping variable . . . . . . . . . . . . . . . . . . . . . . 125
8.2.1 Side-by-side histograms . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.2.2 Side-by-side boxplots . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.2.3 Overlaid density plots . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.2.4 Bar chart with error bars . . . . . . . . . . . . . . . . . . . . . . . . 126
8.3 Bivariate plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
8.3.1 Scatterplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
8.3.2 Scatterplot with multiple y values . . . . . . . . . . . . . . . . . . . 127
8.3.3 Scatterplot with binning . . . . . . . . . . . . . . . . . . . . . . . . . 128
8.3.4 Transparent overplotting scatterplot . . . . . . . . . . . . . . . . . . 128
8.3.5 Bivariate density plot . . . . . . . . . . . . . . . . . . . . . . . . . . 128
8.3.6 Scatterplot with marginal histograms . . . . . . . . . . . . . . . . . 129
8.4 Multivariate plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
8.4.1 Matrix of scatterplots . . . . . . . . . . . . . . . . . . . . . . . . . . 129
8.4.2 Conditioning plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
8.4.3 Contour plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
8.4.4 3-D plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130


8.5 Special-purpose plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130


8.5.1 Choropleth maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
8.5.2 Interaction plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
8.5.3 Plots for categorical data . . . . . . . . . . . . . . . . . . . . . . . . 131
8.5.4 Circular plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
8.5.5 Plot an arbitrary function . . . . . . . . . . . . . . . . . . . . . . . . 131
8.5.6 Normal quantile–quantile plot . . . . . . . . . . . . . . . . . . . . . . 131
8.5.7 Receiver operating characteristic (ROC) curve . . . . . . . . . . . . 132
8.5.8 Plot confidence intervals for the mean . . . . . . . . . . . . . . . . . 132
8.5.9 Plot prediction limits from a simple linear regression . . . . . . . . . 132
8.5.10 Plot predicted lines for each value of a variable . . . . . . . . . . . . 132
8.5.11 Kaplan–Meier plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
8.5.12 Hazard function plotting . . . . . . . . . . . . . . . . . . . . . . . . . 133
8.5.13 Mean–difference plots . . . . . . . . . . . . . . . . . . . . . . . . . . 133
8.6 Further resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
8.7 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
8.7.1 Scatterplot with multiple axes . . . . . . . . . . . . . . . . . . . . . 134
8.7.2 Conditioning plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
8.7.3 Scatterplot with marginal histograms . . . . . . . . . . . . . . . . . 135
8.7.4 Kaplan–Meier plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
8.7.5 ROC curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.7.6 Pairs plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.7.7 Visualize correlation matrix . . . . . . . . . . . . . . . . . . . . . . . 141

9 Graphical options and configuration 145


9.1 Adding elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
9.1.1 Arbitrary straight line . . . . . . . . . . . . . . . . . . . . . . . . . . 145
9.1.2 Plot symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
9.1.3 Add points to an existing graphic . . . . . . . . . . . . . . . . . . . . 146
9.1.4 Jitter points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
9.1.5 Regression line fit to points . . . . . . . . . . . . . . . . . . . . . . . 146
9.1.6 Smoothed line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
9.1.7 Normal density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
9.1.8 Marginal rug plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
9.1.9 Titles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
9.1.10 Footnotes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
9.1.11 Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
9.1.12 Mathematical symbols . . . . . . . . . . . . . . . . . . . . . . . . . . 148
9.1.13 Arrows and shapes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
9.1.14 Add grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
9.1.15 Legend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
9.1.16 Identifying and locating points . . . . . . . . . . . . . . . . . . . . . 148
9.2 Options and parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
9.2.1 Graph size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
9.2.2 Grid of plots per page . . . . . . . . . . . . . . . . . . . . . . . . . . 149
9.2.3 More general page layouts . . . . . . . . . . . . . . . . . . . . . . . . 149
9.2.4 Fonts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
9.2.5 Point and text size . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
9.2.6 Box around plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
9.2.7 Size of margins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
9.2.8 Graphical settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150


9.2.9 Axis range and style . . . . . . . . . . . . . . . . . . . . . . . . . . . 151


9.2.10 Axis labels, values, and tick marks . . . . . . . . . . . . . . . . . . . 151
9.2.11 Line styles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
9.2.12 Line widths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
9.2.13 Colors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
9.2.14 Log scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
9.2.15 Omit axes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
9.3 Saving graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
9.3.1 PDF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
9.3.2 Postscript . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
9.3.3 RTF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
9.3.4 JPEG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
9.3.5 Windows Metafile . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
9.3.6 Bitmap image file (BMP) . . . . . . . . . . . . . . . . . . . . . . . . 153
9.3.7 Tagged Image File Format . . . . . . . . . . . . . . . . . . . . . . . . 153
9.3.8 PNG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
9.3.9 Closing a graphic device . . . . . . . . . . . . . . . . . . . . . . . . . 153

10 Simulation 155
10.1 Generating data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
10.1.1 Generate categorical data . . . . . . . . . . . . . . . . . . . . . . . . 155
10.1.2 Generate data from a logistic regression . . . . . . . . . . . . . . . . 156
10.1.3 Generate data from a generalized linear mixed model . . . . . . . . . 156
10.1.4 Generate correlated binary data . . . . . . . . . . . . . . . . . . . . 157
10.1.5 Generate data from a Cox model . . . . . . . . . . . . . . . . . . . . 158
10.1.6 Sampling from a challenging distribution . . . . . . . . . . . . . . . 159
10.2 Simulation applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
10.2.1 Simulation study of Student’s t-test . . . . . . . . . . . . . . . . . . 161
10.2.2 Diploma (or hat-check) problem . . . . . . . . . . . . . . . . . . . . 162
10.2.3 Monty Hall problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
10.2.4 Censored survival . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
10.3 Further resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

11 Special topics 167


11.1 Processing by group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
11.1.1 Means by group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
11.1.2 Linear models stratified by each value of a grouping variable . . . . 168
11.2 Simulation-based power calculations . . . . . . . . . . . . . . . . . . . . . . 169
11.3 Reproducible analysis and output . . . . . . . . . . . . . . . . . . . . . . . . 171
11.4 Advanced statistical methods . . . . . . . . . . . . . . . . . . . . . . . . . . 173
11.4.1 Bayesian methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
11.4.2 Propensity scores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
11.4.3 Bootstrapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
11.4.4 Missing data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
11.4.5 Finite mixture models with concomitant variables . . . . . . . . . . 185
11.5 Further resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186


12 Case studies 187


12.1 Data management and related tasks . . . . . . . . . . . . . . . . . . . . . . 187
12.1.1 Finding two closest values in a vector . . . . . . . . . . . . . . . . . 187
12.1.2 Tabulate binomial probabilities . . . . . . . . . . . . . . . . . . . . . 188
12.1.3 Calculate and plot a running average . . . . . . . . . . . . . . . . . . 188
12.1.4 Create a Fibonacci sequence . . . . . . . . . . . . . . . . . . . . . . . 189
12.2 Read variable format files . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
12.3 Plotting maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
12.3.1 Massachusetts counties, continued . . . . . . . . . . . . . . . . . . . 192
12.3.2 Bike ride plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
12.3.3 Choropleth maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
12.4 Data scraping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
12.4.1 Scraping data from HTML files . . . . . . . . . . . . . . . . . . . . . 195
12.4.2 Reading data with two lines per observation . . . . . . . . . . . . . . 196
12.4.3 Plotting time series data . . . . . . . . . . . . . . . . . . . . . . . . . 197
12.4.4 Reading tables from HTML . . . . . . . . . . . . . . . . . . . . . . . 198
12.4.5 URL APIs and truly random numbers . . . . . . . . . . . . . . . . . 199
12.4.6 Reading from a web API . . . . . . . . . . . . . . . . . . . . . . . . 200
12.5 Text mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
12.5.1 Retrieving data from arXiv.org . . . . . . . . . . . . . . . . . . . . . 202
12.5.2 Exploratory text mining . . . . . . . . . . . . . . . . . . . . . . . . . 202
12.6 Interactive visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
12.6.1 Visualization using the grammar of graphics (ggvis) . . . . . . . . . 203
12.6.2 Shiny in Markdown . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
12.6.3 Creating a standalone Shiny app . . . . . . . . . . . . . . . . . . . . 206
12.7 Manipulating bigger datasets . . . . . . . . . . . . . . . . . . . . . . . . . . 207
12.8 Constrained optimization: the knapsack problem . . . . . . . . . . . . . . . 208

A Introduction to R and RStudio 211


A.1 Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
A.1.1 Installation under Windows . . . . . . . . . . . . . . . . . . . . . . . 212
A.1.2 Installation under Mac OS X . . . . . . . . . . . . . . . . . . . . . . 213
A.1.3 RStudio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
A.1.4 Other graphical interfaces . . . . . . . . . . . . . . . . . . . . . . . . 213
A.2 Running R and sample session . . . . . . . . . . . . . . . . . . . . . . . . . 214
A.2.1 Replicating examples from the book and sourcing commands . . . . 215
A.2.2 Batch mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
A.3 Learning R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
A.3.1 Getting help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
A.3.2 swirl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
A.4 Fundamental structures and objects . . . . . . . . . . . . . . . . . . . . . . 220
A.4.1 Objects and vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
A.4.2 Indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
A.4.3 Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
A.4.4 Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
A.4.5 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
A.4.6 Dataframes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
A.4.7 Attributes and classes . . . . . . . . . . . . . . . . . . . . . . . . . . 226
A.4.8 Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
A.5 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
A.5.1 Calling functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226


A.5.2 The apply family of functions . . . . . . . . . . . . . . . . . . . . . . 227


A.5.3 Pipes and connections between functions . . . . . . . . . . . . . . . 228
A.6 Add-ons: packages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
A.6.1 Introduction to packages . . . . . . . . . . . . . . . . . . . . . . . . . 229
A.6.2 Packages and name conflicts . . . . . . . . . . . . . . . . . . . . . . . 230
A.6.3 Maintaining packages . . . . . . . . . . . . . . . . . . . . . . . . . . 231
A.6.4 CRAN task views . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
A.6.5 Installed libraries and packages . . . . . . . . . . . . . . . . . . . . . 231
A.6.6 Packages referenced in this book . . . . . . . . . . . . . . . . . . . . 233
A.6.7 Datasets available with R . . . . . . . . . . . . . . . . . . . . . . . . 236
A.7 Support and bugs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236

B The HELP study dataset 237


B.1 Background on the HELP study . . . . . . . . . . . . . . . . . . . . . . . . 237
B.2 Roadmap to analyses of the HELP dataset . . . . . . . . . . . . . . . . . . 237
B.3 Detailed description of the dataset . . . . . . . . . . . . . . . . . . . . . . . 239

C References 243

D Indices 255
D.1 Subject index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
D.2 R index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276


List of Tables

3.1 Quantiles, probabilities, and pseudo-random number generation: available
distributions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

6.1 Formatted results using the xtable package . . . . . . . . . . . . . . . . . . 80

7.1 Generalized linear model distributions supported . . . . . . . . . . . . . . . 92

11.1 Bayesian modeling functions available within the MCMCpack package . . . . 175

12.1 Weights, volume, and values for the knapsack problem . . . . . . . . . . . . 209

A.1 Interactive courses available within swirl . . . . . . . . . . . . . . . . . . . . 219


A.2 CRAN task views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

B.1 Analyses undertaken using the HELP dataset . . . . . . . . . . . . . . . . . 237


B.2 Annotated description of variables in the HELP dataset . . . . . . . . . . . 239


List of Figures

3.1 Comparison of standard normal and t distribution with 1 df . . . . . . . . . 42


3.2 Descriptive plot of the normal distribution . . . . . . . . . . . . . . . . . . . 43

5.1 Density plot of depressive symptom scores (CESD) plus superimposed his-
togram and normal distribution . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2 Scatterplot of CESD and MCS for women, with primary substance shown as
the plot symbol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.3 Graphical display of the table of substance by race/ethnicity . . . . . . . . 63
5.4 Density plot of age by gender . . . . . . . . . . . . . . . . . . . . . . . . . . 65

6.1 Scatterplot of observed values for age and I1 (plus smoothers by substance)
using base graphics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.2 Scatterplot of observed values for age and I1 (plus smoothers by substance)
using the lattice package . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.3 Scatterplot of observed values for age and I1 (plus smoothers by substance)
using the ggplot2 package . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.4 Regression coefficient plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.5 Default diagnostics for linear models . . . . . . . . . . . . . . . . . . . . . . 83
6.6 Empirical density of residuals, with superimposed normal density . . . . . . 84
6.7 Interaction plot of CESD as a function of substance group and gender . . . 85
6.8 Boxplot of CESD as a function of substance group and gender . . . . . . . 86
6.9 Pairwise comparisons (using Tukey HSD procedure) . . . . . . . . . . . . . 88
6.10 Pairwise comparisons (using the factorplot function) . . . . . . . . . . . . . 89

7.1 Scatterplots of smoothed association of physical component score (PCS) with
CESD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.2 Side-by-side box plots of CESD by treatment and time . . . . . . . . . . . . 114
7.3 Recursive partitioning tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
7.4 Graphical display of assignment probabilities or score functions from linear
discriminant analysis by actual homeless status . . . . . . . . . . . . . . . . 122
7.5 Results from hierarchical clustering . . . . . . . . . . . . . . . . . . . . . . . 122

8.1 Plot of InDUC and MCS vs. CESD for female alcohol-involved subjects . . 135
8.2 Association of MCS and CESD, stratified by substance and report of suicidal
thoughts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
8.3 Lattice settings using the mosaic black-and-white theme . . . . . . . . . . . 137
8.4 Association of MCS and PCS with marginal histograms . . . . . . . . . . . 138
8.5 Kaplan–Meier estimate of time to linkage to primary care by randomization
group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139


8.6 Receiver operating characteristic curve for the logistic regression model
predicting suicidal thoughts using the CESD as a measure of depressive
symptoms (sensitivity = true positive rate; 1-specificity = false positive rate) . . 140
8.7 Pairs plot of variables from the HELP dataset using the lattice package . 141
8.8 Pairs plot of variables from the HELP dataset using the GGally package. . 142
8.9 Visual display of correlations (times 100) . . . . . . . . . . . . . . . . . . . . 143

10.1 Plot of true and simulated distributions . . . . . . . . . . . . . . . . . . . . 161

11.1 Generating a new R Markdown file in RStudio . . . . . . . . . . . . . . . . 172


11.2 Sample Markdown input file . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
11.3 Formatted output from R Markdown example . . . . . . . . . . . . . . . . . 174

12.1 Running average for Cauchy and t distributions . . . . . . . . . . . . . . . . 190


12.2 Massachusetts counties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
12.3 Bike ride plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
12.4 Choropleth map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
12.5 Sales plot by time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
12.6 List of questions tagged with dplyr on the Stackexchange website . . . . . 201
12.7 Interactive graphical display . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
12.8 Shiny within R Markdown . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
12.9 Display of Shiny document within Markdown . . . . . . . . . . . . . . . . . 206
12.10 Number of flights departing Bradley airport on Mondays over time . . . . . 209

A.1 R Windows graphical user interface . . . . . . . . . . . . . . . . . . . . . . . 212


A.2 R Mac OS X graphical user interface . . . . . . . . . . . . . . . . . . . . . . 213
A.3 RStudio graphical user interface . . . . . . . . . . . . . . . . . . . . . . . . . 214
A.4 Sample session in R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
A.5 Documentation on the mean() function . . . . . . . . . . . . . . . . . . . . 218
A.6 Display after running RSiteSearch("eta squared anova") . . . . . . . . . 219


Preface to the second edition

Software systems such as R evolve rapidly, and so do the approaches and expertise of
statistical analysts.
In 2009, we began a blog in which we explored many new case studies and applications,
ranging from generating a Fibonacci series to fitting finite mixture models with concomitant
variables. We also discussed some additions to R, the RStudio integrated development
environment, and new or improved R packages. The blog now has hundreds of entries and
according to Google Analytics has received hundreds of thousands of visits.
The volume you are holding is in a larger format and is longer than the first edition.
Much of the new material is adapted from these blog entries; it also includes other
improvements and additions that have emerged in the last few years.
We have extensively reorganized the material in the book and created three new chapters.
The first, “Simulation,” includes examples where data are generated from complex
models such as mixed-effects models and survival models, and from distributions using
the Metropolis–Hastings algorithm. We also explore interesting statistics and probability
examples via simulation. The second is “Special topics,” where we describe some key fea-
tures, such as processing by group, and detail several important areas of statistics, including
Bayesian methods, propensity scores, and bootstrapping. The last is “Case studies,” where
we demonstrate examples of useful data management tasks, read complex files, make and
annotate maps, show how to “scrape” data from the web, mine text files, and generate
dynamic graphics.
We also describe RStudio in detail. This powerful and easy-to-use front end adds in-
numerable features to R. In our experience, it dramatically increases the productivity of R
users, and by tightly integrating reproducible analysis tools, helps avoid error-prone “cut
and paste” workflows. Our students and colleagues find RStudio an extremely comfortable
interface.
We used a reproducible analysis system (knitr) to generate the example code and
output in the book. Code extracted from these files is provided on the book website. In
this edition, we provide a detailed discussion of the philosophy and use of these systems. In
particular, we feel that the knitr and markdown packages for R, which are tightly integrated
with RStudio, should become a part of every R user’s toolbox. We can’t imagine working
on a project without them.
The second edition of the book features extensive use of a number of new packages
that extend the functionality of the system. These include dplyr (tools for working with
dataframe-like objects and databases), ggplot2 (implementation of the Grammar of Graph-
ics), ggmap (spatial mapping using ggplot2), ggvis (to build interactive graphical displays),
httr (tools for working with URLs and HTTP), lubridate (date and time manipulations),
markdown (for simplified reproducible analysis), shiny (to build interactive web applica-
tions), swirl (for learning R, in R), tidyr (for data manipulation), and xtable (to cre-
ate publication-quality tables). Overall, these packages facilitate ever more sophisticated
analyses.


Finally, we’ve reorganized much of the material from the first edition into smaller, more
focused chapters. Readers will now find separate (and enhanced) chapters on data input
and output, data management, statistical and mathematical functions, and programming,
rather than a single chapter on “data management.” Graphics are now discussed in two
chapters: one on high-level types of plots, such as scatterplots and histograms, and another
on customizing the fine details of the plots, such as the number of tick marks and the color
of plot symbols.
We’re immensely gratified by the positive response the first edition elicited, and hope
the current volume will be even more useful to you.

On the web
The book website at http://www.amherst.edu/~nhorton/r2 includes the table of contents,
the indices, the HELP dataset in various formats, example code, a pointer to the blog, and
a list of errata.

Acknowledgments
In addition to those acknowledged in the first edition, we would like to thank J.J. Allaire
and the RStudio developers, Danny Kaplan, Deborah Nolan, Daniel Parel, Randall Pruim,
Romain Francois, and Hadley Wickham, plus the many individuals who have created and
shared R packages. Their contributions to R and RStudio, programming efforts, comments,
and guidance and/or helpful suggestions on drafts of the revision have been extremely
helpful. Above all, we greatly appreciate Sara and Julia as well as Abby, Alana, Kinari,
and Sam, for their patience and support.

Amherst, MA
October 2014


Preface to the first edition

R (R Development Core Team, 2009) is a general purpose statistical software package used
in many fields of research. It is licensed for free, as open-source software. The system is
developed by a large group of people, almost all volunteers. It has a large and growing user
and developer base. Methodologists often release applications for general use in R shortly
after they have been introduced into the literature. While professional customer support is
not provided, there are many resources to help support users.
We have written this book as a reference text for users of R. Our primary goal is to
provide users with an easy way to learn how to perform an analytic task in this system,
without having to navigate through the extensive, idiosyncratic, and sometimes unwieldy
documentation or to sort through the huge number of add-on packages. We include many
common tasks, including data management, descriptive summaries, inferential procedures,
regression analysis, multivariate methods, and the creation of graphics. We also show some
more complex applications. In toto, we hope that the text will facilitate more efficient use
of this powerful system.
We do not attempt to exhaustively detail all possible ways available to accomplish a
given task in R. Neither do we claim to provide the most elegant solution. We
have tried to provide a simple approach that is easy to understand for a new user, and have
supplied several solutions when it seems likely to be helpful.

Who should use this book

Those with an understanding of statistics at the level of multiple-regression analysis
should find this book helpful. This group includes professional analysts who use statistical
packages almost every day as well as statisticians, epidemiologists, economists, engineers,
physicians, sociologists, and others engaged in research or data analysis. We anticipate that
this tool will be particularly useful for sophisticated users, those with years of experience
in another system, who need or want to use R. However, intermediate-level
analysts should reap the same benefit. In addition, the book will bolster the analytic
abilities of a relatively new user, by providing a concise reference manual and annotated
examples.

Using the book

The book has two indices, in addition to the comprehensive table of contents. These
include: 1) a detailed topic (subject) index in English; 2) an R command index, describing
R syntax.
Extensive example analyses of data from a clinical trial are presented; see Table B.1
(p. 237) for a comprehensive list. These employ a single dataset (from the HELP study),
described in Appendix B. Readers are encouraged to download the dataset and code from
the book website. The examples demonstrate the code in action and facilitate exploration
by the reader.


In addition to the HELP examples, a case studies and extended examples chapter uti-
lizes many of the functions, idioms and code samples introduced earlier. These include
explications of analytic and empirical power calculations, missing data methods, propensity
score analysis, sophisticated data manipulation, data gleaning from websites, map making,
simulation studies, and optimization. Entries from earlier chapters are cross-referenced to
help guide the reader.

Where to begin

We do not anticipate that the book will be read cover to cover. Instead, we hope that the
extensive indexing, cross-referencing, and worked examples will make it possible for readers
to directly find and then implement what they need. A new user should begin by reading
the first chapter, which includes a sample session and overview of the system. Experienced
users may find the case studies to be valuable as a source of ideas on problem solving in R.

Acknowledgments

We would like to thank Rob Calver, Kari Budyk, Shashi Kumar, and Sarah Morris for
their support and guidance at Informa CRC/Chapman and Hall. We also thank Ben Cowl-
ing, Stephanie Greenlaw, Tanya Hakim, Albyn Jones, Michael Lavine, Pamela Matheson,
Elizabeth Stuart, Rebbecca Wilson, and Andrew Zieffler for comments, guidance and/or
helpful suggestions on drafts of the manuscript.
Above all we greatly appreciate Julia and Sara as well as Abby, Alana, Kinari, and Sam,
for their patience and support.

Northampton, MA and Amherst, MA


February, 2010

Chapter 1

Data input and output

This chapter reviews data input and output, including reading and writing files in spread-
sheet, ASCII file, native, and foreign formats.

1.1 Input
R provides comprehensive support for data input and output. In this section we address
aspects of these tasks. Datasets are organized in dataframes (A.4.6), or connected series
of rectangular arrays, which can be saved as platform-independent objects. UNIX-style
directory delimiters (forward slash) are allowed on Windows.

1.1.1 Native dataset


Example: 7.10
load(file="dir_location/savedfile") # works on all OS including Windows
or
load(file="dir_location\\savedfile") # Windows only

Note: Forward slash is supported as a directory delimiter on all operating systems; a double
backslash is supported under Windows. The file savedfile is created by save() (see 1.2.3).
Running the command print(load(file="dir_location/savedfile")) will display the
objects that are added to the workspace.
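
As a minimal sketch of the save-and-load round trip, the following uses a temporary file and two made-up objects. The names x, ds, and savedfile are assumptions chosen for illustration, not objects from the book's examples.

```r
# Illustrative objects; the names x and ds are assumptions made for this sketch
x = 1:5
ds = data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))

# Save both objects to a temporary native-format file
savedfile = tempfile(fileext = ".RData")
save(x, ds, file = savedfile)

# Remove them, then restore; load() returns the names of the restored objects
rm(x, ds)
loaded = load(file = savedfile)
print(loaded)
```

Because load() restores objects under their original names, it can silently overwrite objects of the same name that are already in the workspace.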

1.1.2 Fixed format text files


See 1.1.9 (read more complex fixed files) and 12.2 (read variable format files).
ds = read.table("dir_location\\file.txt", header=TRUE) # Windows only
or
ds = read.table("dir_location/file.txt", header=TRUE) # all OS (including
# Windows)
Note: Forward slash is supported as a directory delimiter on all operating systems; a double
backslash is supported under Windows. If the first row of the file includes the names of the
variables, these entries will be used to create appropriate names (reserved characters such as
‘$’ or ‘[’ are changed to ‘.’) for each of the columns in the dataset. If the first row doesn’t
include the names, the header option can be left off (or set to FALSE), and the variables
will be called V1, V2, ..., Vn. A limit on the number of lines to be read can be specified
through the nrows option. The read.table() function can support reading from a URL
as a filename (see 1.1.12) or browse files interactively using read.table(file.choose())
(see 4.3.7).
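
The behavior of the header and nrows options can be checked with a small self-contained sketch. The file contents and the variable names (txtfile, first2, nohdr) are invented for illustration only.

```r
# A small space-delimited file whose header row contains a reserved character ($)
txtfile = tempfile(fileext = ".txt")
writeLines(c("id age$yrs", "1 34", "2 51", "3 47"), txtfile)

# header=TRUE: column names come from the first row, with '$' changed to '.'
ds = read.table(txtfile, header = TRUE)
names(ds)

# nrows limits the number of data rows that are read
first2 = read.table(txtfile, header = TRUE, nrows = 2)

# header=FALSE: default names V1, V2 are used and the first row is treated as data
nohdr = read.table(txtfile, header = FALSE)
names(nohdr)
```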

1.1.3 Other fixed files


See 1.1.9 (read more complex fixed files) and 12.2 (read variable format files).
Sometimes data arrives in files that are very irregular in shape. For example, there may
be a variable number of fields per line, or some data in the line may describe the remainder
of the line. In such cases, a useful generic approach is to read each line into a single character
variable, then use character variable functions (see 2.2) to extract the contents.
ds = readLines("file.txt")
or
ds = scan("file.txt")

Note: The readLines() function returns a character vector with length equal to the number
of lines read (see file()). A limit on the number of lines to be read can be specified through
the n option of readLines() or the nlines option of scan(). The scan() function returns a
vector, with entries separated by white space by default. These functions read by default
from standard input (see stdin() and
?connections), but can also read from a file or URL (see 1.1.12). The read.fwf() function
may also be useful for reading fixed-width files.
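
The following sketch illustrates the generic approach on an invented irregular file, in which the first field of each line gives the number of values that follow. The file layout and all variable names are assumptions made for this example.

```r
# An irregular file: the first field on each line says how many values follow
rawfile = tempfile(fileext = ".txt")
writeLines(c("3 10 20 30", "1 5", "2 7 9"), rawfile)

# Read each line into one element of a character vector
lines = readLines(rawfile)

# Split each line on spaces, then separate the leading count from the values
fields = strsplit(lines, " ")
counts = as.numeric(sapply(fields, function(f) f[1]))
values = lapply(fields, function(f) as.numeric(f[-1]))

# The n option limits how many lines readLines() reads
firsttwo = readLines(rawfile, n = 2)

# scan() ignores the line structure and returns every whitespace-separated entry
flat = scan(rawfile, quiet = TRUE)
```

Keeping the values in a list preserves the variable number of fields per line, whereas scan() flattens everything into a single vector.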

1.1.4 Comma-separated value (CSV) files


Example: 2.6.1
ds = read.csv("dir_location/file.csv")

Note: The stringsAsFactors option can be set to FALSE to prevent automatic creation of
factors for categorical variables (FALSE is the default as of R 4.0.0). A limit on the number
of lines to be read can be specified through
the nrows option. The command read.csv(file.choose()) can be used to browse files
interactively (see 4.3.7). The comma-separated file can be given as a URL (see 1.1.12). The
colClasses option can be used to speed up reading large files. Caution is needed when
reading date and time variables (see 2.4).
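
A brief sketch of these options follows, using a temporary CSV file with invented contents. The column names and values are assumptions for illustration.

```r
# Write a small CSV file to a temporary location
csvfile = tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3,
                     group = c("a", "b", "a"),
                     visit = c("2010-02-01", "2010-03-15", "2010-04-20")),
          csvfile, row.names = FALSE)

# Declaring column types up front avoids type guessing, which helps with large files
ds = read.csv(csvfile, stringsAsFactors = FALSE,
              colClasses = c("integer", "character", "character"))

# Dates are read as character here and converted explicitly
ds$visit = as.Date(ds$visit, "%Y-%m-%d")

# nrows limits the number of data rows read
first2 = read.csv(csvfile, nrows = 2)
```

Reading dates as character and converting them with an explicit format string keeps the conversion visible, which is one way to exercise the caution mentioned above.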

1.1.5 Read sheets from an Excel file


library(gdata)
ds = read.xls("http://www.amherst.edu/~nhorton/r2/datasets/help.xlsx",
sheet=1)
Note: The sheet number can be provided as a number or a name.

1.1.6 Read data from R into SAS


The R package foreign includes the write.dbf() function; we recommend this as a reliable
format for extracting data from R into a SAS-ready file, though other options are possible.
Then SAS proc import can easily read the DBF file.

tosas = data.frame(ds)
library(foreign)
write.dbf(tosas, "dir_location/tosas.dbf")
This can be read into SAS using the following commands:
proc import datafile="dir_location\tosas.dbf"
out=fromr dbms=dbf;
run;

1.1.7 Read data from SAS into R


library(foreign)
ds = read.dbf("dir_location/to_r.dbf")
or
library(sas7bdat)
helpfromSAS = read.sas7bdat("dir_location/help.sas7bdat")
Note: The first set of code assumes SAS has been used to write out a dataset in DBF format.
The second can be used with any SAS formatted dataset; it is based on a reverse-engineering
of the SAS dataset format, which SAS has not made public.
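
The DBF route can be tried without SAS by writing a file from R and reading it back. This is only a sketch: the dataframe contents and the temporary file are invented, and it assumes the foreign package is installed (it is normally distributed with R as a recommended package).

```r
library(foreign)   # assumed to be available; normally shipped with R

# An invented dataframe standing in for the data to be exchanged
tosas = data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))

# Write to a temporary DBF file, then read it back in
dbffile = tempfile(fileext = ".dbf")
write.dbf(tosas, dbffile)
back = read.dbf(dbffile)

# Inspect the types that survive the round trip
str(back)
```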

1.1.8 Reading datasets in other formats


Example: 6.6.1
library(foreign)
ds = read.dbf("filename.dbf") # DBase
ds = read.epiinfo("filename.epiinfo") # Epi Info
ds = read.mtp("filename.mtp") # Minitab portable worksheet
ds = read.octave("filename.octave") # Octave
ds = read.ssd("filename.ssd") # SAS version 6
ds = read.xport("filename.xport") # SAS XPORT file
ds = read.spss("filename.sav") # SPSS
ds = read.dta("filename.dta") # Stata
ds = read.systat("filename.sys") # Systat
Note: The foreign package can read Stata, Epi Info, Minitab, Octave, SPSS, and Systat
files (with the caveat that SAS files may be platform dependent). The read.ssd() function
will only work if SAS is installed on the local machine.

1.1.9 Reading more complex text files


See 1.1.2 (read fixed files) and 12.2 (read variable format files).
Text data files often contain data in special formats. One common example is date
variables. As an example, we consider the following data.

1 AGKE 08/03/1999 $10.49
2 SBKE 12/18/2002 $11.00
3 SEKK 10/23/1995 $5.00

tmpds = read.table("file_location/filename.dat")
id = tmpds$V1
initials = tmpds$V2
datevar = as.Date(as.character(tmpds$V3), "%m/%d/%Y")
cost = as.numeric(substr(tmpds$V4, 2, 100))
ds = data.frame(id, initials, datevar, cost)
rm(tmpds, id, initials, datevar, cost)

or (for the date)

library(lubridate)
library(dplyr)
tmpds = mutate(tmpds, datevar = mdy(V3))

Note: This task is accomplished by first reading the dataset (with default names from
read.table() denoted V1 through V4). These objects can be manipulated using
as.character() to undo the default coding as factor variables, and coerced to the appropri-
ate data types. For the cost variable, the dollar signs are removed using the substr() func-
tion. Finally, the individual variables are bundled together as a dataframe. The lubridate
package includes functions to make handling date and time values easier; the mdy() function
is one of these.
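
For readers who want to run this without creating a data file, the sketch below supplies the three example records through the text argument of read.table(). It is a variation on the code above rather than the book's own listing: it reads the columns as character (stringsAsFactors=FALSE) and uses sub() instead of substr() to drop the dollar sign.

```r
# The three example records, supplied inline instead of from a file
rawtext = c("1 AGKE 08/03/1999 $10.49",
            "2 SBKE 12/18/2002 $11.00",
            "3 SEKK 10/23/1995 $5.00")
tmpds = read.table(text = rawtext, stringsAsFactors = FALSE)

# Convert the date string and strip the literal dollar sign from the cost
ds = data.frame(id = tmpds$V1,
                initials = tmpds$V2,
                datevar = as.Date(tmpds$V3, "%m/%d/%Y"),
                cost = as.numeric(sub("$", "", tmpds$V4, fixed = TRUE)),
                stringsAsFactors = FALSE)
ds
```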

1.1.10 Reading data with a variable number of words in a field

Reading data in a complex data format will generally require a tailored approach. Here
we give a relatively simple example and outline the key tools useful for reading in data in
complex formats. Suppose we have data as follows:
1 Las Vegas, NV --- 53.3 --- --- 1
2 Sacramento, CA --- 42.3 --- --- 2
3 Miami, FL --- 41.8 --- --- 3
4 Tucson, AZ --- 41.7 --- --- 4
5 Cleveland, OH --- 38.3 --- --- 5
6 Cincinnati, OH 15 36.4 --- --- 6
7 Colorado Springs, CO --- 36.1 --- --- 7
8 Memphis, TN --- 35.3 --- --- 8
8 New Orleans, LA --- 35.3 --- --- 8
10 Mesa, AZ --- 34.7 --- --- 10
11 Baltimore, MD --- 33.2 --- --- 11
12 Philadelphia, PA --- 31.7 --- --- 12
13 Salt Lake City, UT --- 31.9 17 --- 13

The --- means that the value is missing. Note two complexities here. First, fields are
delimited by both spaces and commas, where the latter separates the city from the state.
Second, cities may have names consisting of more than one word.
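
One possible way to handle both complexities is a regular expression that anchors on the comma and the two-letter state code. This is a sketch under stated assumptions, not necessarily the approach the text goes on to develop. Because the source does not name the columns, rank, v1 through v4, and rank2 are hypothetical labels, and only three of the records are used.

```r
# Three of the records shown above
lines = c("1 Las Vegas, NV --- 53.3 --- --- 1",
          "6 Cincinnati, OH 15 36.4 --- --- 6",
          "13 Salt Lake City, UT --- 31.9 17 --- 13")

# Number, city (any number of words), comma, state, four fields, final number
pattern = "^([0-9]+) (.+), ([A-Z]{2}) ([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+) ([0-9]+)$"
parts = regmatches(lines, regexec(pattern, lines))

# Each element holds the full match followed by the eight captured groups
fields = do.call(rbind, parts)[, -1]

# Convert to numeric, treating --- as missing
tonum = function(x) as.numeric(ifelse(x == "---", NA, x))

ds = data.frame(rank = as.numeric(fields[, 1]),   # hypothetical column names
                city = fields[, 2],
                state = fields[, 3],
                v1 = tonum(fields[, 4]),
                v2 = tonum(fields[, 5]),
                v3 = tonum(fields[, 6]),
                v4 = tonum(fields[, 7]),
                rank2 = as.numeric(fields[, 8]),
                stringsAsFactors = FALSE)
ds
```

The pattern relies on every record containing exactly one comma followed by a two-letter uppercase state code; records that break this assumption would need separate handling.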

decretals, the donation of Constantine, and the decretum of Gratian.
The last subject ought to be carefully studied by all who wish to
understand the frightful tyranny of a complicated system of laws,
devised not for the protection of a people, but as instruments for
grinding them to subjection. Then, after an historical outline of the
general growth of the Papal power in the twelfth and thirteenth
centuries, the writers enter upon the peculiarly episcopal and clerical
question, pointing out how marvellously every little change worked in
one direction, invariably tending to throw the rule of the Church into
the power of Rome; and how the growth of new institutions, like the
monastic orders and the Inquisition, gradually withdrew the conduct
of affairs from the Bishops of the Church in general, and
consolidated the Papal influence. For all this, however, unless we
could satisfy ourselves with a mere magnified table of contents the
reader must be referred to the book itself, in which he will find the
interest sustained without flagging to the end.”—Pall Mall
Gazette.

“In France, in Holland, and in Germany, there has already


appeared a multitude of disquisitions on this subject. Among these
several are the acknowledged compositions of men of high standing
in the Roman Catholic world,—men admittedly entitled to speak with
the authority that must attach to established reputation: but not one
of them has hitherto produced a work more likely to create a deep
impression than the anonymous German publication at the head of
this notice. It is not a piece of merely polemical writing, it is a treatise
dealing with a large subject in an impressive though partisan
manner, a treatise grave in tone, solid in matter, and bristling with
forcible and novel illustrations.”—Spectator.

“Rumour will, no doubt, be busy with its conjectures as to the


name which lurks beneath the nom de plume of ‘Janus.’ We do not
intend to offer any contribution towards the elucidation of the mystery
unless it be a contribution to say that the book bears internal
evidence of being the work of a Catholic, and that there are not
many Catholics in Europe who could have written it. Taking it all in
all, it is no exaggerated praise to characterise it as the most
damaging assault on Ultra-montanism that has appeared in modern
times. Its learning is copious and complete, yet so admirably
arranged that it invariably illustrates without overlaying the argument.
The style is clear and simple, and there is no attempt at rhetoric. It is
a piece of cool and masterly dissection, all the more terrible for the
passionless manner in which the author conducts the operation.”—
Times.

LETTERS FROM ROME ON THE COUNCIL. By


Quirinus. Reprinted from the “Allgemeine
Zeitung.” Authorized Translation. Crown 8vo.
12s.

FEMALE CHARACTERS OF HOLY SCRIPTURE. In


a Series of Sermons. By the Rev. Isaac
Williams, B.D., formerly Fellow of Trinity
College, Oxford. New Edition. Crown 8vo. 5s.

THE CHARACTERS OF THE OLD TESTAMENT. In


a Series of Sermons. By the Rev. Isaac
Williams, B.D., formerly Fellow of Trinity
College, Oxford. New Edition. Crown 8vo., 5s.
“This is one of the few volumes of published sermons that we
have been able to read with real pleasure. They are written with a
chastened elegance of language, and pervaded by a spirit of earnest
and simple piety. Mr. Williams is evidently what would be called a
very High Churchman. Occasionally his peculiar Church views are
apparent; but bating a few passages here and there, these sermons
will be read with profit by all ‘who profess and call themselves
Christians.’”—Contemporary Review.
“This is a new edition of a very popular—and deservedly popular
—work on the biography of the Old Testament history. The
characters are ably and profitably analysed, and that by the hand of
a master of style and thought.... The principle of selection has been
that of prominence; and partly, too, that of significance in the
characters so ably delineated. A more masterly analysis of Scriptural
characters we never read, nor any which are more calculated to
impress the mind of the reader with feelings of love for what is good,
and abhorrence for what is evil.”—Rock.

THE HILLFORD CONFIRMATION: A TALE. By


M. C. Phillpotts. 18mo. 1s.

APOSTOLICAL SUCCESSION IN THE CHURCH


OF ENGLAND. By the Rev. Arthur W.
Haddan, B.D., Rector of Barton-on-the-Heath,
and late Fellow of Trinity College, Oxford. 8vo.
12s.
“Mr. Haddan’s estimate of the bearing of his subject, and of its
special importance at the present juncture is characteristic, and will
well repay attention.... Mr. Haddan is strictly argumentative
throughout. He abstains with some strictness from everything which
would divert either his reader or himself from accurate investigation
of his reasoning. But his volume is thoroughly well written, clear and
forcible in style, and fair in tone. It cannot but render valuable service
in placing the claims of the Church in their true light before the
English public.”—Guardian.

“Among the many standard theological works devoted to this


important subject Mr. Haddan’s will hold a high place.”—Standard.

“We should be glad to see the volume widely circulated and


generally read.”—John Bull.
“A weighty and valuable treatise, and we hope that the study of
its sound and well-reasoned pages will do much to fix the
importance, and the full meaning of the doctrine in question, in the
minds of Church people.... We hope that our extracts will lead our
readers to study Mr. Haddan for themselves.”—Literary
Churchman.

“This is not only a very able and carefully written treatise upon
the doctrine of Apostolical Succession, but it is also a calm yet noble
vindication of the validity of the Anglican Orders: it well sustains the
brilliant reputation which Mr. Haddan left behind him at Oxford, and it
supplements his other profound historical researches in
ecclesiastical matters. This book will remain for a long time the
classic work upon English Orders.”—Church Review.

“A very temperate, but a very well reasoned book.”—


Westminster Review.

“Mr. Haddan ably sustains his reputation throughout the work.


His style is clear, his inferences are reasonable, and the publication
is especially well-timed in prospect of the coming Œcumenical
Council.”—Cambridge University Gazette.

A MANUAL FOR THE SICK; with other Devotions.


By Lancelot Andrewes, D.D., sometime Lord
Bishop of Winchester. Edited with a Preface by
H. P. Liddon, M.A. Large type. With Portrait.
24mo. 2s. 6d.

HELP AND COMFORT FOR THE SICK POOR. By


the Author of “Sickness; its Trials and
Blessings.” New Edition. Small 8vo. 1s.
A DEVOTIONAL COMMENTARY ON THE GOSPEL
NARRATIVE. By the Rev. Isaac Williams, B.D.,
formerly Fellow of Trinity College, Oxford. A New
and uniform Edition. In Eight volumes. Crown
8vo. 5s. each.

THOUGHTS ON THE STUDY OF THE


HOLY GOSPELS.

Characteristic Differences in the Four Gospels.

Our Lord’s Manifestations of Himself.

The Rule of Scriptural Interpretation furnished by our


Lord.

Analogies of the Gospel.

Mention of Angels in the Gospels.

Places of our Lord’s Abode and Ministry.

Our Lord’s Mode of Dealing with His Apostles.

Conclusion.

A HARMONY OF THE FOUR


EVANGELISTS.
Our Lord’s Nativity.

Our Lord’s Ministry—Second Year.

Our Lord’s Ministry—Third Year.

The Holy Week.

Our Lord’s Passion.

Our Lord’s Resurrection.

OUR LORD’S NATIVITY.

The Birth at Bethlehem.

The Baptism in Jordan.

The First Passover.

OUR LORD’S MINISTRY.


second year.

The Second Passover.

Christ with the Twelve.

The Twelve sent forth.


OUR LORD’S MINISTRY.
third year.

Teaching in Galilee.

Teaching at Jerusalem.

Last Journey from Galilee to Jerusalem.

THE HOLY WEEK.

The Approach to Jerusalem.

The Teaching in the Temple.

The Discourse on the Mount of Olives.

The Last Supper.

OUR LORD’S PASSION.

The Hour of Darkness.

The Agony.

The Apprehension.

The Condemnation.

The Day of Sorrows.


The Hall of Judgment.

The Crucifixion.

The Sepulture.

OUR LORD’S RESURRECTION.

The Day of Days.

The Grave Visited.

Christ Appearing.

The Going to Emmaus.

The Forty Days.

The Apostles Assembled.

The Lake in Galilee.

The Mountain in Galilee.

The Return from Galilee.


“There is not a better companion to be found for the season than
the beautiful ‘Devotional Commentary on the Gospel Narrative,’ by
the Rev. Isaac Williams.... A rich mine for devotional and theological
study.”—Guardian.
“So infinite are the depths and so innumerable the beauties of
Scripture, and more particularly of the Gospels, that there is some
difficulty in describing the manifold excellences of Williams’ exquisite
Commentary. Deriving its profound appreciation of Scripture from the
writings of the early Fathers, it is only what every student knows
must be true to say that it extracts a whole wealth of meaning from
each sentence, each apparently faint allusion, each word in the
text.”—Church Review.

“Stands absolutely alone to our English literature; there is, we


should say, no chance of being superseded by any better book of its
kind; and its merits are of the very highest order.”—Literary
Churchman.

“It would be difficult to select a more useful present, at a small


cost, than this series would be to a young man on his first entering
into Holy Orders, and many, no doubt, will avail themselves of the
republication of these useful volumes for this purpose. There is an
abundance of sermon material to be drawn from any one of them.”—
Church Times.

“This is, in the truest sense of the word, a ‘Devotional


Commentary’ on the Gospel narrative, opening out everywhere, as it
does, the spiritual beauties and blessedness of the Divine message;
but it is something more than this, it meets difficulties almost by
anticipation, and throws the light of learning over some of the very
darkest passages in the New Testament.”—Rock.

“The author has skilfully compared and blended the narratives of


the different Gospels, so as to give a synoptical view of the history;
and though the commentary is called ‘devotional,’ it is scholarly and
suggestive in other respects. The size of the work, extending, as it
does, over eight volumes, may deter purchasers and readers; but
each volume is complete in itself, and we recommend students to
taste a sample of the author’s quality. Some things they may
question; but the volumes are really a helpful and valuable addition
to our stores.”—Freeman.

“The high and solemn verities of the Saviour’s sufferings and


death are treated with great reverence and ability. The thorough
devoutness which pervades the book commends it to our heart.
There is much to instruct and help the believer in the Christian life,
no matter to what section of the Church he may belong.”—
Watchman.

KEYS TO CHRISTIAN KNOWLEDGE.

A KEY TO THE KNOWLEDGE AND USE OF THE


HOLY BIBLE. By the Rev. J. H. Blunt, M.A.
Small 8vo. 2s. 6d.
“Another of Mr. Blunt’s useful and workmanlike compilations,
which will be most acceptable as a household book, or in schools
and colleges. It is a capital book too for schoolmasters and pupil
teachers.”—Literary Churchman.

“As a popular handbook, setting forth a selection of facts of


which everybody ought to be cognizant, and as an exposition of the
claims of the Bible to be received as of superhuman origin, Mr.
Blunt’s ‘Key’ will be useful.”—Churchman.

“A great deal of useful information is comprised in these pages,


and the book will no doubt be extensively circulated in Church
families.”—Clerical Journal.

“We have much pleasure in recommending a capital handbook


by the learned editor of ‘The Annotated Book of Common Prayer.’”—
Church Times.
“Merits commendation for the lucid and orderly
arrangement in which it presents a considerable
amount of valuable and interesting matter.”—
Record.

A KEY TO THE KNOWLEDGE AND USE OF THE


BOOK OF COMMON PRAYER. By the Rev.
J. H. Blunt, M.A. Small 8vo. 2s. 6d.
“A very valuable and practical manual, full of information, which
is admirably calculated to instruct and interest those for whom it was
evidently specially intended—the laity of the Church of England. It
deserves high commendation.”—Churchman.

“A thoroughly sound and valuable manual.”—Church Times.

“To us it appears that Mr. Blunt has succeeded very well. All
necessary information seems to be included, and the arrangement is
excellent.”—Literary Churchman.

“It is the best short explanation of our offices that we know of,
and would be invaluable for the use of candidates for confirmation in
the higher classes.”—John Bull.

A KEY TO CHRISTIAN DOCTRINE AND


PRACTICE FOUNDED ON THE CHURCH
CATECHISM. By the Rev. John Henry Blunt,
M.A. Small 8vo. 2s. 6d.
“Of cheap and reliable text-books of this nature there has
hitherto been a great want. We are often asked to recommend books
for use in Church Sunday-schools, and we therefore take this
opportunity of saying that we know of none more likely to be of
service both to teachers and scholars than these ‘Keys.’”—
Churchman’s Shilling Magazine.

“This is another of Mr. Blunt’s most useful manuals, with all the
precision of a school book, yet diverging into matters of practical
application so freely as to make it most serviceable, either as a
teacher’s suggestion book, or as an intelligent pupil’s reading
book.”—Literary Churchman.

“Will be very useful for the higher classes in Sunday-schools, or


rather for the fuller instruction of the Sunday-school teachers
themselves, where the parish priest is wise enough to devote a
certain time regularly to their preparation for their voluntary task.”—
Union Review.

“Another of the many useful books on theological and Scriptural


subjects which have been written by the Rev. John Henry Blunt. The
present is entitled ‘A Key to Christian Doctrine and Practice, founded
on the Church Catechism,’ and will take its place as an elementary
text-book upon the Creed in our schools and colleges. The Church
Catechism is clearly and fully explained by the author in this ‘Key’.
Numerous references, Scriptural and otherwise, are scattered about
the book.”—Public Opinion.

A KEY TO THE KNOWLEDGE OF CHURCH


HISTORY. (Ancient.) Edited by John Henry
Blunt, M.A. Small 8vo. 2s. 6d.
“It offers a short and condensed account of the origin, growth,
and condition of the Church in all parts of the world, from a.d. 1
down to the end of the fifteenth century. Mr. Blunt’s first object has
been conciseness, and this has been admirably carried out, and to
students of Church history this feature will readily recommend itself.
As an elementary work ‘A Key’ will be specially valuable, inasmuch
as it points out certain definite lines of thought, by which those who
enjoy the opportunity may be guided in reading the statements of
more elaborate histories. At the same time it is but fair to Mr. Blunt to
remark that, for general readers, the little volume contains everything
that could be consistently expected in a volume of its character.
There are many notes, theological, scriptural, and historical, and the
‘get up’ of the book is specially commendable. As a text-book for the
higher forms of schools the work will be acceptable to numerous
teachers.”—Public Opinion.

“It contains some concise notes on Church History, compressed


into a small compass, and we think it is likely to be useful as a book
of reference.”—John Bull.

“A very terse and reliable collection of the main facts and


incidents connected with Church History.”—Rock.

“It will be excellent, either for school or home use, either as a


reading or as a reference book, on all the main facts and names and
controversies of the first fifteen centuries. It is both well arranged and
well written.”—Literary Churchman.

A KEY TO THE KNOWLEDGE OF CHURCH


HISTORY (Modern). Edited by the Rev. John
Henry Blunt, M.A. Small 8vo. 2s. 6d.

A KEY TO THE NARRATIVE OF THE FOUR


GOSPELS. By John Pilkington Norris, M.A.,
Canon of Bristol, formerly one of Her Majesty’s
Inspectors of Schools. Small 8vo. 2s. 6d.
“This is very much the best book of its kind we have seen. The
only fault is its shortness, which prevents its going into the details
which would support and illustrate its statements, and which in the
process of illustrating them would fix them upon the minds and
memories of its readers. It is, however, a great improvement upon
any book of its kind we know. It bears all the marks of being the
condensed work of a real scholar, and of a divine too. The bulk of the
book is taken up with a ‘Life of Christ’ compiled from the Four
Gospels so as to exhibit its steps and stages and salient points. The
rest of the book consists of independent chapters on special
points.”—Literary Churchman.

“This book is no ordinary compendium, no mere ‘cram-book’;


still less is it an ordinary reading book for schools; but the
schoolmaster, the Sunday-school teacher and the seeker after a
comprehensive knowledge of Divine truth will find it worthy of its
name. Canon Norris writes simply, reverently, without great display of
learning, giving the result of much careful study in a short compass,
and adorning the subject by the tenderness and honesty with which
he treats it.... We hope that this little book will have a very wide
circulation and that it will be studied; and we can promise that those
who take it up will not readily put it down again.”—Record.

“This is a golden little volume. Having often to criticise


unsparingly volumes published by Messrs. Rivington, and bearing
the deep High Church brand, it is the greater satisfaction to be able
to commend this book so emphatically. Its design is exceedingly
modest. Canon Norris writes primarily to help ‘younger students’ in
studying the Gospels. But this unpretending volume is one which all
students may study with advantage. It is an admirable manual for
those who take Bible Classes through the Gospels. Closely sifted in
style, so that all is clear and weighty; full of unostentatious learning,
and pregnant with suggestion; deeply reverent in spirit, and
altogether Evangelical in spirit; Canon Norris’ book supplies a real
want, and ought to be welcomed by all earnest and devout students
of the Holy Gospels.”—London Quarterly Review.

A KEY TO THE ACTS OF THE APOSTLES. By


John Pilkington Norris, M.A. Small 8vo. 2s.
6d.
“It is a remarkably well-written and interesting account of its
subject, ‘The Book of the Acts,’ giving us the narrative of St. Luke
with exactly what we want in the way of connecting links and
illustrations. One most notable and praiseworthy characteristic of the
book is its candour.... The book is one which we can heartily
recommend.”—Spectator.

“Of Canon Norris’s ‘Key to the Narrative of the Four Gospels,’


we wrote in high approval not many months ago. The present is not
less carefully prepared, and is full of the unostentatious results of
sound learning and patient thought.”—London Quarterly Review.

“This little volume is one of a series of ‘Keys’ of a more or less


educational character, which are in the course of publication by
Messrs. Rivington. It gives apparently a very fair and tolerably
exhaustive résumé of the contexts of the Acts, with which it deals,
not chapter by chapter, but consecutively in the order of thought.”—
School Board Chronicle.

“Few books have ever given us more unmixed pleasure than


this. It is faultlessly written, so that it reads as pleasantly and
enticingly as if it had not the least intention of being an ‘educational’
book. It is complete and exhaustive, so far as the narrative and all its
bearings go, so that students may feel that they need not be hunting
up other books to supply the lacunæ. It is the work of a classical
scholar, and it leaves nothing wanting in the way of classical
illustrations, which in the case of the Acts are of special importance.
And, lastly, it is theologically sound.”—Literary Churchman.

“This is a sequel to Canon Norris’s ‘Key to the Gospels,’ which


was published two years ago, and which has become a general
favourite with those who wish to grasp the leading features of the life
and word of Christ. The sketch of the Acts of the Apostles is done in
the same style; there is the same reverent spirit and quiet
enthusiasm running through it, and the same instinct for seizing the
leading points in the narrative.”—Record.

⁂ Other Volumes are in preparation.

RIVINGTON’S DEVOTIONAL
SERIES.
Elegantly printed with red borders. 16mo.
2s. 6d. each.

THOMAS À KEMPIS, OF THE IMITATION OF


CHRIST.

Also a Cheap Edition, without the red


borders, 1s., or in Cover, 6d.
“A very beautiful edition. We commend it to the Clergy as an
excellent gift-book for teachers and other workers.”—Church Times.

“This work is a precious relic of mediæval times, and will


continue to be valued by every section of the Christian Church.”—
Weekly Review.

“A beautifully printed pocket edition of this marvellous production


of a man, who, out of the dark mists of popery, saw so much of
experimental religion. Those who are well grounded in evangelical
truth may use it with profit.”—Record.

“A very cheap and handsome edition.”—Rock.


“This new edition is a marvel of cheapness.”—Church Review.

“Beautifully printed, and very cheap editions of this long-used


hand-book of devotion.”—Literary World.

THE RULE AND EXERCISES OF HOLY


LIVING. By Jeremy Taylor, D.D., Bishop
of Down and Connor, and Dromore.

Also a Cheap Edition, without the red


borders, 1s.

THE RULE AND EXERCISES OF HOLY DYING. By


Jeremy Taylor, D.D., Bishop of Down and
Connor, and Dromore.

Also a Cheap Edition, without the red


borders, 1s.

The ‘Holy Living’ and the ‘Holy Dying’ may be had


bound together in One Volume, 5s.; or without
the red borders, 2s. 6d.
“An extremely well-printed and well got up edition, as pretty and
graceful as possible, and yet not too fine for real use. We wish the
devotions of this beautiful book were more commonly used.”—
Literary Churchman.

“We must admit that there is a want of helps to spiritual life
amongst us. Our age is so secular, and in religious movements so
bustling, that it is to be feared the inner life is too often forgotten. Our
public teachers may, we are sure, gain by consulting books which
show how contentedness and self-renunciation may be increased;
and in which the pathology of all human affections is treated with a
fulness not common in our theological class rooms.”—Freeman.

“The publishers have done good service by the production of
these beautiful editions of works, which will never lose their
preciousness to devout Christian spirits. It is not necessary for us to
say a word as to their intrinsic merits; we have only to testify to the
good taste, judgment, and care shown in these editions. They are
extremely beautiful in typography and in the general getting up.”—
English Independent.

“We ought not to conclude our notice of recent devotional books,
without mentioning to our readers the above new, elegant, and
cheap reprint, which we trust will never be out of date or out of
favour in the English branch of the Catholic Church.”—Literary
Churchman.

“These manuals of piety written by the pen of the most beautiful
writer and the most impressive divine of the English Church, need no
commendation from us. They are known to the world, read in all
lands, and translated, we have heard, into fifty different languages.
For two centuries they have fed the faith of thousands upon
thousands of souls, now we trust happy with their God, and perhaps
meditating in Heaven with gratitude on their celestial truths, kindled
in their souls by a writer who was little short of being inspired.”—
Rock.

“These little volumes will be appreciated as presents of
inestimable value.”—Public Opinion.

“Either separate or bound together, may be had these two
standard works of the great divine. A good edition very tastefully
printed and bound.”—Record.
A SHORT AND PLAIN INSTRUCTION FOR THE
BETTER UNDERSTANDING OF THE LORD’S
SUPPER; to which is annexed the Office of the
Holy Communion, with proper Helps and
Directions. By Thomas Wilson, D.D., late Lord
Bishop of Sodor and Man. Complete Edition, in
large type.

Also a Cheap Edition, without the red
borders, 1s., or in Cover, 6d.
“The Messrs. Rivington have published a new and unabridged
edition of that deservedly popular work, Bishop Wilson on the Lord’s
Supper. The edition is here presented in three forms, suited to the
various members of the household.”—Public Opinion.

“We cannot withhold the expression of our admiration of the
style and elegance in which this work is got up.”—Press and St.
James’ Chronicle.

“A departed author being dead yet speaketh in a way which will
never be out of date; Bishop Wilson on the Lord’s Supper, published
by Messrs. Rivington, in bindings to suit all tastes and pockets.”—
Church Review.

“We may here fitly record that Bishop Wilson on the Lord’s
Supper has been issued in a new but unabridged form.”—Daily
Telegraph.

INTRODUCTION TO THE DEVOUT LIFE. From the
French of Saint Francis of Sales, Bishop and
Prince of Geneva. A New Translation.
“A very beautiful edition of S. Francis de Sales’ ‘Devout Life:’ a
prettier little edition for binding, type, and paper, of a very great book
is not often seen.”—Church Review.

“The translation is a good one, and the volume is beautifully got
up. It would serve admirably as a gift book to those who are able to
appreciate so spiritual a writer as St. Francis.”—Church Times.

“It has been the food and hope of countless souls ever since its
first appearance two centuries and a half ago, and it still ranks with
Scupoli’s ‘Combattimento Spirituale,’ and Arvisenet’s ‘Memoriale
Vitæ Sacerdotalis,’ as among the very best works of ascetic
theology. We are glad to commend this careful and convenient
version to our readers.”—Union Review.

“We should be curious to know by how many different hands
‘The Devout Life’ of S. Francis de Sales had been translated into
English. At any rate, its popularity is so great that Messrs. Rivington
have just issued another translation of it. The style is good, and the
volume is of a most convenient size.”—John Bull.

“To readers of religious treatises, this volume will be highly
valued. The ‘Introduction to the Devout Life’ is preceded by a sketch
of the life of the author, and a dedicatory prayer of the author is also
given.”—Public Opinion.

A PRACTICAL TREATISE CONCERNING EVIL
THOUGHTS: wherein their Nature, Origin, and
Effect are distinctly considered and explained,
with many Useful Rules for restraining and
suppressing such Thoughts; suited to the
various conditions of Life, and the several
tempers of Mankind, more especially of
melancholy Persons. By William Chilcot, M.A.