DOING DATA SCIENCE IN R
An Introduction for Social Scientists
Mark Andrews

Los Angeles
London
New Delhi
Singapore
Washington DC
Melbourne
SAGE Publications Ltd
1 Oliver’s Yard
55 City Road
London EC1Y 1SP
SAGE Publications Inc.
2455 Teller Road
Thousand Oaks, California 91320
SAGE Publications India Pvt Ltd
B 1/I 1 Mohan Cooperative Industrial Area
Mathura Road
New Delhi 110 044
SAGE Publications Asia-Pacific Pte Ltd
3 Church Street
#10-04 Samsung Hub
Singapore 049483
© Mark Andrews 2021
Apart from any fair dealing for the purposes of research or private
study, or criticism or review, as permitted under the Copyright,
Designs and Patents Act, 1988, this publication may be reproduced,
stored or transmitted in any form, or by any means, only with the
prior permission in writing of the publishers, or in the case of
reprographic reproduction, in accordance with the terms of licences
issued by the Copyright Licensing Agency. Enquiries concerning
reproduction outside those terms should be sent to the publishers.
Library of Congress Control Number: 2020945072
British Library Cataloguing in Publication data
A catalogue record for this book is available from the British Library
ISBN 978-1-5264-8676-9
ISBN 978-1-5264-8677-6 (pbk)
Editor: Aly Owen
Assistant editor: Lauren Jacobs
Production editor: Ian Antcliff
Copyeditor: QuADS Prepress Pvt Ltd
Proofreader: Neville Hankins
Marketing manager: Ben Griffin-Sherwood
Cover design: Shaun Mercier
Typeset by: C&M Digitals (P) Ltd, Chennai, India
Printed in the UK
At SAGE we take sustainability seriously. Most of our products are
printed in the UK using responsibly sourced papers and boards.
When we print overseas we ensure sustainable papers are used as
measured by the PREPS grading system. We undertake an annual
audit to monitor our sustainability.
CONTENTS
About the Author
Online Resources
1 Data Analysis and Data Science
Part I Fundamentals of Data Analysis and Data Science
2 Introduction to R
3 Data Wrangling
4 Data Visualization
5 Exploratory Data Analysis
6 Programming in R
7 Reproducible Data Analysis
Part II Statistical Modelling
8 Statistical Models and Statistical Inference
9 Normal Linear Models
10 Logistic Regression
11 Generalized Linear Models for Count Data
12 Multilevel Models
13 Nonlinear Regression
14 Structural Equation Modelling
Part III Advanced or Special Topics in Data Analysis
15 High-Performance Computing with R
16 Interactive Web Apps with Shiny
17 Probabilistic Modelling with Stan
References
Index
ABOUT THE AUTHOR
Mark Andrews (PhD)
is Senior Lecturer in the Department of Psychology at Nottingham
Trent University. There, he specializes in teaching statistics and data
science at all levels from undergraduate to PhD level. Currently, he is
the Chair of the British Psychological Society’s Mathematics,
Statistics, and Computing section. Between 2015 and 2018, Dr
Andrews was funded by the UK’s Economic and Social Research
Council (ESRC) to provide advanced training workshops on Bayesian
data analysis to UK-based researchers at PhD level and beyond in
the social sciences. Dr Andrews’ background is in computational
cognitive science, particularly focused on Bayesian models of human
cognition. He has a PhD in Cognitive Science from Cornell University,
and was a postdoctoral researcher in the Gatsby Computational
Neuroscience Unit in UCL and also in the Department of Psychology
in UCL.
ONLINE RESOURCES

Lecturers can visit https://study.sagepub.com/andrews to find a range of additional
resources to support teaching and aid your students’ study.
INSTRUCTOR RESOURCES
• PowerPoint slides covering key themes and topics from every chapter, which
are available for you to download and tailor in support of your teaching.
• An instructor’s manual, providing a guide to using the book in teaching and
resources for teaching, including ideas for student activities and assessments.
• Datasets for you to share with your students in class or for assignments, which
will support their mastery of data science techniques.
1 DATA ANALYSIS AND DATA SCIENCE
Introduction
What is data science?
Why R, not Python?
Who is this book for?
The style and structure of this book
1.1 INTRODUCTION
This book is about statistical data analysis of real-world data using modern tools. It is
aimed at those who are currently engaged in, or planning to be engaged in, analysis
of statistical data of the kind that might arise at or beyond PhD level scientific
research, especially in the social sciences. The data in these fields is complex. There
are many variables and complex relationships between them. Analysing this data
almost always requires data wrangling, exploration, and visualization. Above all, it
involves modelling the data using flexible probabilistic models. These models are then
used to reason and make predictions about the scientific phenomenon being studied.
This book aims to address all of these topics. The term we use for these topics and
their corresponding methods and tools is data science.

Figure 1.1 The data science workflow. Raw data is usually messy and not yet
amenable to analysis of any kind. Data wrangling takes the raw data and transforms
it into a new tidy format. This data is then explored and visualized in an iterative
manner, which may also include some further wrangling. This eventually leads to
probabilistic modelling, which itself involves an iterative process of statistical
inference and model evaluation. Finally, we communicate our results in articles,
presentations, webpages, etc.
As we use the term throughout this book, data science is a set of interrelated
computational or mathematical methods and tools that are used in the general data
analysis workflow that we outline in Figure 1.1. This workflow begins with data in its
nascent and raw form. Raw data is usually impossible or extremely difficult to work
with, even casually or informally. The process of transforming the data so that it is
amenable to further analysis is data wrangling, and the resulting data sets are said to
be tidy. This data can then be explored and visualized. We view data exploration and
data visualization as ultimately accomplishing the same thing. One usually involves
quantitative descriptive analysis, while the other involves graphical analysis, but both
aim to discover potentially interesting patterns and behaviours in the data. The
exploratory analysis stage then leads us to posit a tentative probabilistic model of the
data. Put more precisely, it leads us to posit a tentative probabilistic model of the
phenomenon that generated the data. Inevitably, this model involves unknown
variables that must be inferred using statistical inference. This leads to a fitted
model, which may then be evaluated and possibly extended and modified, thus
leading to further inference. Eventually, we communicate our results in reports,
presentations, webpages, etc.
Each of the stages of this data science workflow involves computational and
mathematical concepts and methods. In fact, it is this combination of the
computational and the mathematical or statistical that is a defining feature or key
characteristic of data science as we conceive of it. Without using computers, and thus
performing any stages of the workflow manually in some manner, only practically
trivial types of analysis could be accomplished, and even then the analysis would be
laborious and error prone. By contrast, the more proficient we are with the relevant
computational tools, the more efficient and sophisticated our analyses can be. In this
sense, computing skills, specifically reading and writing code, are integral and vital
parts of modern data analysis. These cannot be generally sidestepped or avoided by
using graphical user interfaces (GUIs) to statistics programs. While programs like
these are sometimes suitable for novices or for casual use, they are profoundly
limited and inefficient in comparison to writing code in a high-level programming
language.
In addition to computing tools, many of the stages of the data science workflow
involve mathematical and statistical concepts and methods. This is especially true of
the statistical modelling stage, which requires a proper understanding of
mathematical and probabilistic models, and related topics such as statistical
inference. Simply being able to perform a statistical analysis computationally,
accompanied by a vague and impressionistic understanding of what the analysis is
doing and why it is doing it, will not generally be sufficient. Without a deeper and
theoretical understanding of probabilistic models, statistical inference, and related
concepts, we will not be able to make principled and informed choices concerning
which models to use for any given problem. Nor would we be able to understand the
meaning of the results of the inference, and we would be limited or mistaken in the
practical and scientific conclusions that we make when we use these models for
explanation or prediction. Moreover, statistical models of the kind that we cover in
this book should not be seen as a list of independent tools in a big toolbox, each one
designed for a different task or application, and each with its own rules and
principles. Rather, more generally, we should view statistical modelling as a
systematic framework, or even a language, for building mathematical models of
scientific phenomena using observed data. While we may talk about normal linear
models, or zero-inflated Poisson models, etc., these are just examples of the infinitely
many models that we can build to model the scientific problem at hand. Being aware
of statistical modelling as a flexible and systematic framework that is based on
pragmatic and theoretical principles allows us to more competently and confidently
perform statistical analysis, and also greatly increases the range and scope of the
analyses that are readily available to us.
1.2 WHAT IS DATA SCIENCE?
Even if we accept the nature and the value of the data analysis workflow that we’ve
just outlined, it is reasonable to ask whether it should properly be called ‘data
science’. Is this not just using a new word, even a buzzword, in place of much more
established terms like ‘statistics’ or ‘statistical data analysis’? We are using the term
‘data science’ rather than ‘statistics’ per se or some variant thereof because data
analysis as we’ve outlined it arguably goes beyond the usual focus of statistics, at
least as it is traditionally understood. Mathematical statistics as a scientific or
mathematical discipline has focused largely on the statistical modelling component of
the programme we outlined above. As we’ve hopefully made clear, this component is
of profound importance, and in fact we would argue that it is the single most
important part and even ultimate goal of data analysis. Nonetheless, in practice, data
wrangling alone occupies far more of our time and effort in any analysis, and
exploration and visualization should be seen as necessary precursors to, and even
continuous with, the statistical modelling itself. Likewise, traditional statistics often
marginalizes the practical matter of computing tools. In statistics textbooks, even
excellent ones, for example, code examples may not be provided for all analyses, and
the code may not be integrated tightly with the coverage of the statistical methods.
In this sense, traditional statistics does not thoroughly deal with all the parts of the
data analysis workflow that we have outlined. This is not a criticism of statistics, but
just a recognition of its particular focus.
This general point about real-world data analysis being more than just the traditional
focus of mathematical statistics was actually made decades ago by Tukey (1962).
There, Tukey, who was one of the most influential statisticians of the twentieth
century and a pioneer of exploratory data analysis and data visualization, preferred
the term data analysis as the general term for what he and other statistical analysts
actually do in practice. For Tukey, inferential statistics and statistical modelling were
necessary and vital, but only as a component of a much bigger and multifaceted
undertaking, which he called ‘data analysis’.
While the general spirit of the argument about the breadth and scope of data
analysis that Tukey (1962) outlined is very much in keeping with the perspective we
follow here, modern data analysis has a character that goes beyond Tukey’s vision,
however broad and comprehensive it was. This is due to the computing revolution.
For example, when Tukey was writing in the early 1960s, the world’s fastest
computers¹ were capable of around 1 million calculations per second. Approximately
60 years later in 2020, the world’s fastest computer² is capable of around 500
quadrillion (5 × 10¹⁷) calculations per second, and a typical consumer desktop can
perform hundreds of billions of calculations per second. This revolution has
transformed all aspects of data analysis, and now computing is as vital and integral a
part of data analysis as are mathematics and statistics. It was largely the recognition
of the vital and transformative role of computing that led Cleveland (2001) to coin
the term ‘data science’. As we use the term, therefore, data science is the blend of
computational and statistical methods applied to all the aspects of data analysis.
1 https://en.wikipedia.org/wiki/Atlas_(computer)
2 https://en.wikipedia.org/wiki/Fugaku_(supercomputer)
Even if we accept that the defining feature of data science is generally the combined
application of computational tools and statistical methods to data analysis, the term
‘data science’ has some popular connotations that are somewhat at odds with the
more general understanding of the term that we are following in this book. In
particular, for some people, data science is all about concepts like predictive analytics,
big data, machine learning, deep learning, and data mining of massive unstructured data
sets including natural language corpora. It is seen as largely a branch of computer
science and engineering, and is something that is done in the big tech companies like
Google, Amazon, Facebook, Apple, Netflix and Twitter. It is absolutely true that data
science, particularly as it is practised in industry, makes heavy use of tools like machine
learning and big data analysis software, and is applied to the analysis of massive
unstructured data sets. As real and important as these activities are, we see them
here as just one application of data science as we generally understand the term.
Also, in this particular application of data science, some topics and issues take
precedence over or dominate others. For example, in these contexts, the software
and hardware problems of being able to analyse data that is on such a large scale
are the major practical issues. Likewise, for some applications, being able to perform
successful predictions using statistical methods is the only goal, and so the assumed
statistical models on which these predictions are based are less important or even
irrelevant (see Breiman, 2001, for a well-known early discussion of these two
different general ‘cultures’ of using statistical methods).
In summary, in this book, we use the term ‘data science’ as the general term for
modern data analysis, which is something that always involves a tight integration of
computational and statistical methods and tools. In this, we are hopefully faithfully
following the broad and general understanding of what real-world data analysis
entails as described by Tukey (1962), albeit with the additional vital feature of
intensive use of computational tools. In some contexts, data science has a more
particular focus on big data, data mining, machine learning and related concepts.
That particular focus is not the focus of this book, and so this book is probably not
ideal for anyone keen to learn more about data science in just this sense of the term.
1.3 WHY R, NOT PYTHON?
We have stated repeatedly that computational methods and tools are vital for doing
data science. In this book, the computing language and environment that we use is
the R Project for Statistical Computing, simply known as R. More specifically, we use
the modern incarnation of R that is based on the so-called tidyverse. In Chapter 2 we
provide a proper introduction to R. Here, we wish to just outline why R is our choice
of language and environment, what the alternatives are, and what this entails in
terms of our conception of what data science is and how it is practised.
Given our conception of the data science workflow that we outlined in Figure 1.1, R is
an inevitable choice. We believe that R is simply the best option for performing all the
major components that we outline there. For example, for the data wrangling
component, which can be extremely laborious, R packages that are part of the
tidyverse such as readr, dplyr and tidyr, which we cover in Chapter 3, make
data wrangling fast and efficient and even pleasurable. For data visualization, the
ggplot2 package provides us with essentially a high-level and expressive language
for data visualization. For the statistical modelling loop, which we cover in all the
chapters of Part II of this book, R provides a huge treasure trove of packages for
virtually every conceivable type of statistical method and model. Also, R is the
dominant environment, using packages like rstan and brms, for doing Bayesian
probabilistic modelling using the Stan probabilistic programming language. We cover
Bayesian models throughout the chapters of Part II. For communication, R provides
us with the ability to produce reproducible data analysis reports using RMarkdown,
knitr and other tools, which we describe in Chapter 7.
Everything we cover in this book could be done using another programming
language, or possibly using some set of different languages. Chief among these
alternatives is Python. Python is close to being the most widely used general-purpose
programming language of any kind. It has been very popular for almost two decades,
and its dominance and popularity have been increasing in recent years. Moreover,
one of Python’s major domains of application is data science, with some arguing that
it should be preferred over R for data science generally. For the big data, big tech, data
mining, machine learning sense of data science that we mentioned above, Python
certainly ought to be the dominant choice over R. This is for multiple reasons. First,
Python is now the principal computing language for doing machine learning, deep
learning and related activities. Also, because Python is a general-purpose
programming language and one that is widely used on the back end of web
applications, this makes integrating data science tools with the ‘production’ web
server software much easier and more scalable. Likewise, Python is a very powerful and
well-designed general-purpose programming language, which means that it is easier
to write complex highly structured software applications in Python than in a
specialized language like R. This again facilitates the integration of Python data
science tools with production- or enterprise-level software applications. Nonetheless,
for the more general conception of data science that we are following in this book,
Python is more limited than R. For example, for data wrangling of typical rectangular
data structures, Python’s pandas package, as excellent as it is, is not as high level
and expressive as R’s tidyverse-based packages like dplyr and tidyr. This entails
that wrangling data into shape in R can be easier and involve less lower-level
procedural and imperative code than when using Python. Likewise for data
visualization, Python’s matplotlib package is very powerful but is also lower-level
than ggplot2. This entails that relatively complex visualization requires considerably
more procedural or imperative code, which is harder and slower to read and write
than using the more expressive high-level code of ggplot2. The higher-level
counterpart of matplotlib is seaborn, which is excellent, but seaborn is less
powerful and extensive in terms of its features than ggplot2. For statistical
modelling, at the moment, there frankly is no competition between R and Python.
Even though Python has excellent statistics packages like statsmodels, these
provide only a fraction of the statistical models and methods that are available from R
packages. Finally, although dynamic notebooks like Jupyter³ are widely used by
Python users, and are excellent too, it is not as easy to create reproducible reports,
for example for publication in scientific journals, using Jupyter as it is using
RMarkdown and knitr. In fact, currently the easiest way to write a Python-based
reproducible manuscript is to use Python within R using the reticulate package.
3 https://jupyter.org/
1.4 WHO IS THIS BOOK FOR?
As mentioned at the start of this chapter, the prototypical audience at whom this
book is aimed are those engaged in data analysis in scientific research, specifically
research at or beyond PhD level. In scientific research, statistics obviously plays a
vital role, and specifically this is based on using data to build and interpret statistical
or probabilistic models of the scientific phenomenon being studied. This book is
heavily focused on this particular kind of statistical data analysis. As we’ve
mentioned, in data science as it is practised in industry and business, often the other
‘culture’ of statistics (see Breiman, 2001), namely predictive analytics and algorithms,
is the dominant one, and so this book is not ideal for those whose primary data
science interests are of that kind.
We’ve explicitly stated that this book is intended for those doing research in the
social sciences, but this also requires some explanation. The explicit targeting of the
social sciences is largely just to keep some focus and limits to the sets of examples
that are used throughout the book. However, beyond the example data sets that are
used, there is little about this content that is of relevance to only those doing
research in social science disciplines. All the content on data wrangling, exploration
and visualization, statistical modelling, etc., is hopefully just as relevant to someone
doing research in some field of biology as it is to someone doing research in the
social sciences. The nature of the data in terms of its complexity, and the nature of
the analysis of this data using complex statistical models, are traditionally very similar
in biology and social sciences. In fact, the statistics practised in all of these fields has
all arisen from a common original source, particularly the early twentieth-century
pioneering work of R. A. Fisher (e.g. Fisher, 1925).
We assume that the readers of this book will already be familiar with statistics to an
extent. For example, we assume that they’ve taken undergraduate-level courses
introducing statistics as it is used and applied in some discipline of science. We will
present the statistical methods that we cover from a foundational perspective, and so
not assume that readers are already confident and familiar with the fundamental
principles of statistical inference and modelling. However, we do assume that they
will have already had an introduction to statistics so that concepts like the ‘normal
distribution’, ‘linear regression’ and ‘confidence intervals’ will be relatively familiar,
even if they don’t have a very precise grasp of their technical meaning. On the other
hand, we do not assume any familiarity with any computing methods, nor R in
particular. In fact, we assume that many readers will be brand new to R.
1.5 THE STYLE AND STRUCTURE OF THIS BOOK
Apart from this brief introductory chapter, the remainder of the book is a blend of
expository text, R code, mathematical equations, diagrams and R-based plots. It is
intended that people will read this book while using R to execute all the code
examples and so produce all the results that are presented either as R output or as
figures. Of course, if readers wish to read first and then run the code later, perhaps
on a second reading, that is entirely a matter of preference. However, all the code
that we present throughout this book is ready to run, and does not require anything
other than the R packages that are explicitly mentioned in the code and the relevant
data sets, which are all available on the website that accompanies the book.
The book is divided into three parts. Part I is all about the parts of the data science
workflow shown in Figure 1.1 except for the statistical modelling loop part. Thus, in
Part I we provide a comprehensive general introduction to R, a chapter on data
wrangling using dplyr, tidyr, etc., a chapter on data visualization, and another on
data exploration. We then go into more detail about programming in R, and conclude
Part I with a chapter on doing reproducible data analysis using tools like RMarkdown
and Git. Part II of the book, which is the largest part, is all about the statistical
modelling loop part of the data science workflow. There, we provide a general
introduction to statistical inference, and then cover all the major types of regression
models, specifically normal linear regression, generalized linear models, multilevel
models, nonlinear regression, and path analysis and related models. In Part III,
which is the shortest part, we cover some specialized topics that are not necessarily
part of the statistical modelling topics, but not general or introductory either.
Specifically, in Part III, we provide an introduction to using R for high-performance
computing, making interactive graphics web apps using Shiny, and a general
introduction to Bayesian probabilistic programming using Stan.
PART I FUNDAMENTALS OF DATA
ANALYSIS AND DATA SCIENCE
PART I CONTENTS
Chapter 2: Introduction to R
Chapter 3: Data Wrangling
Chapter 4: Data Visualization
Chapter 5: Exploratory Data Analysis
Chapter 6: Programming in R
Chapter 7: Reproducible Data Analysis
2 INTRODUCTION TO R
What is R, and why should we use it?
A power tool for data analysis
Open source software
Popularity
Installing R and RStudio
Installing R
Installing RStudio Desktop
Guided Tour of RStudio Desktop
RStudio menus
First steps in R
Step 0: Using the R console
Step 1: Using R as a calculator
Step 2: Variables and assignment
Step 3: Vectors
Step 4: Data frames
Step 5: Other data structures
Step 6: Functions
Step 7: Scripts
Step 8: Installing and loading packages
Step 9: Reading in and viewing data
Step 10: Working directory, RStudio projects, and clean workspaces
2.1 WHAT IS R, AND WHY SHOULD WE USE IT?
While there are many ways of defining what R is, for most practical purposes, it is
sufficient to describe R simply as a program for doing statistics and data analysis. If
you’ve done any kind of statistics or data analyses, the chances are extremely high
that you’ve used some computer program to do so. The range of such programs is
large. They include SPSS, SAS, Stata, Minitab, Python, Matlab, Maple, Mathematica,
Tableau, Excel, SQL, and many others. These do not all do the same thing, and so
are not necessarily interchangeable. Some, like Python, are general-purpose
programming languages that have become widely used for data science. Others, like
SQL, are database languages. SPSS is primarily a GUI program for statistics, originally
targeted at researchers in the social sciences. R can be seen as just another program
in this large and heterogeneous list. The advantages of R, however, which set it apart
from many other programs, boil down to three interrelated factors: it is immensely
powerful, it is open source, and it is very (and increasingly) widely used. Let us now
consider each of these three points further.
A power tool for data analysis
The range and depth of statistical analyses and general data analyses that can be
accomplished with R are immense:
• Built into R’s standard set of packages is virtually the entire repertoire of widely
known and used statistical methods. These include general and generalized
linear regression analyses (which themselves include analyses of variance, t-tests
and correlations), descriptive statistical methods and nonparametric methods.
• Also built into R is an extensive graphics library (see the graphics package,
which is usually termed the ‘base R’ plotting package) for doing virtually the
entire repertoire of statistical plots and graphics, and these graphics tools can be
combined programmatically to lead to any desired plot or visualization.
• In addition to its built-in tools, R has a vast set of add-on or contributed
packages. There are presently over 16,000 additional contributed packages (to
be precise, there are 16,105 packages as of 12 August 2020). While they differ
in size, each one will usually provide at least dozens of additional tools and
methods for statistics, data manipulation and processing, or graphics. Some of
these packages could be described as almost mini-languages in themselves. For
example, and as we’ll see below, the package ggplot2 is effectively a mini-
language for data visualization, while packages like dplyr and tidyr are
effectively mini-languages for data wrangling and manipulation. In addition,
because R is the de facto standard computing platform for the discipline of
statistics, almost every new or existing statistical technique developed by
statisticians is made available as a package in R. With all of these packages, we
are hard pressed to find anything at all related to statistics and data analysis,
including data graphics and visualization, that is not currently available in R.
• As large as the set of R packages is, the capabilities of R do not stop here. R is a
high-level and expressive programming language that is specialized to efficiently
manipulate and perform calculations or analyses on data. This entails that R can
be used programmatically to greatly increase the speed and efficiency of any
data analysis. More importantly, R can be extended by writing custom programs
and functions, which may then be packaged and distributed for others to use.
While writing large or complex extension packages would require some
programming skill and experience, programming in R on a smaller and simpler
scale is in fact relatively easy, and basics can be mastered quickly. Given that R is
a programming language, there is then effectively no real limit on its capabilities.
The R programming language itself can be extended by interfacing with other
programming languages like C, C++, Fortran and Python. In particular, the
popular Rcpp package greatly simplifies integrating R with C++, thus allowing
fast and efficient C++ code to be used seamlessly within R. Likewise, R can be
easily interfaced with high-performance computing or big data tools like Hadoop,
Spark, SQL, parallel computing libraries, cluster computing, and so on.
Taken together, these points entail that R is an extremely powerful and extensible
environment for doing any kind of statistical computing or data analysis.
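To make the points above concrete, here is a minimal sketch that uses only the tools that ship with R, applied to the built-in mtcars data set. The function name standardize is our own hypothetical choice for illustration, and functions are not covered properly until a later step.

```r
# Linear regression of fuel efficiency (mpg) on car weight (wt),
# using the mtcars data set that ships with R.
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)

# Two-sample t-test comparing mpg between the two transmission types (am).
t.test(mpg ~ am, data = mtcars)

# Pearson correlation between two variables.
cor(mtcars$mpg, mtcars$wt)

# A small custom function (hypothetical name): rescale values to have
# mean 0 and standard deviation 1.
standardize <- function(x) {
  (x - mean(x)) / sd(x)
}
z <- standardize(mtcars$mpg)
```

None of this requires any add-on package; contributed packages extend this base in the same way that standardize extends it here.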
Open source software
R is free and open source software, distributed according to the GNU General Public License.
Likewise, virtually all of the 16,000 or so contributed R packages are free and open
source software, with over 99% of them being distributed in accordance with one of
the major open source licences, such as GNU, MIT, BSD, Apache, Creative Commons
or Artistic. It is important to emphasize the distinctions in practice and in principle
between free and open source software, on the one hand, and freeware, on the
other. Freeware is proprietary software that is distributed, usually only in binary form
and with certain restrictions and conditions, at no monetary cost to the user. While it
can be used in a limited sense at no cost, it cannot be extended or developed, its
source code cannot be viewed, and its non-monetary cost can be revoked at any
time. Free and open source software, on the other hand, is licensed so that anyone
can use it and develop it in any manner, including and especially by viewing and
extending its source code. Free and open source software is defined by four essential
freedoms:1
1 https://www.gnu.org/philosophy/free-sw.html
The freedom to run the program in any manner and for any purpose
The freedom to study and modify the source code
The freedom to distribute copies of the original code
The freedom to distribute modified versions of the code.
In practical terms, the most obvious consequence of R’s free and open source nature
is that it is freely available for everyone to use, on more or less any device they
choose. It is most widely used on Windows, Macs, and Linux, but because it is
available as open source it can in principle be compiled for any platform, and can be
used on Android, iOS, Chrome OS, and many others. This means that anyone can use
R at any time anywhere and always at no cost. And because of its licence, this will
always be the case.
Open source software always has the potential to ‘go viral’ and develop a large self-
sustaining community of user/developers. This is precisely what has happened in the
case of R. Users are drawn in initially because it is available at no cost, can be used
on any platform, and has a large number of built-in or add-on tools. Because R is an
open platform, developers such as academic statisticians or data scientists who want
to reach a large audience write further add-on packages and make them publicly
available. This draws in more users. The users themselves may write blogs, books,
articles, or teach with R, thus attracting still more users, and so on.
Popularity
The Journal of Statistical Software2 is the most widely used academic journal
describing advances and developments in software for statistics. While it accepts
articles describing methods implemented in a wide variety of languages, it is
overwhelmingly dominated by programs written in R. This fact illustrates that when it
comes to the computational implementation of modern statistical methods, R is the
de facto standard.
2 https://www.jstatsoft.org
In an extensive analysis of general data science software (Muenchen, 2019), R is
ranked as one of the five most popular data science programs in jobs for data
scientists, and in multiple surveys of data scientists, it is often ranked as the first or
second most widely used data science tool, and among the most widely ‘followed’
topics on Quora and LinkedIn. Likewise, despite being a domain-specific language,
R is currently very highly ranked according to many rankings of widely used
programming and scripting languages of any kind. For
example, the latest RedMonk ratings3 place R at rank 13; the latest TIOBE ratings4
place R at rank 8; and the latest PYPL ratings5 place R at rank 7.
3 https://redmonk.com/sogrady/2020/07/27/language-rankings-6-20/
4 https://www.tiobe.com/tiobe-index
5 http://pypl.github.io/PYPL.html
2.2 INSTALLING R AND RSTUDIO
Installing R is usually a painless process, but it also usually involves two, or maybe
three, separate steps. First, R itself must be installed. This will install the R
interpreter and also what we will call R’s standard library. The interpreter is the
means by which all our R commands are converted into machine code that is then
executed on our machine. The standard library is the set of built-in packages
mentioned above that provide all the basic or most widely used tools for doing
statistics, analysis and visualization. With that, we will have a fully functioning R
environment. However, by stopping here, the interface through which we’ll interact
with R will be very minimal and lacking many features that would make our use of R
more pleasant and efficient. As such, the second installation step will be to install the
RStudio Desktop environment. This is a very popular interface to R that will greatly
transform the ease and efficiency with which we work with R. Throughout the
remainder of this book, we’ll assume that we’re always working with the RStudio
Desktop. The third installation step is the installation of extra R packages. As
mentioned above, R has over 16,000 add-on packages. Users can install them on
demand from within R in a manner just like installing apps on a mobile device. When
first installing R, it’s often a good idea to also install a minimal set of must-have
packages. After that, additional R packages can be installed as and when they are
needed.
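As a brief preview of that third step, the following sketch checks which of a set of packages are not yet installed. The two package names are taken from the examples mentioned earlier in this chapter and are only illustrative, and the actual installation line is commented out because it needs an internet connection.

```r
# Packages we would like to have (illustrative choice).
wanted <- c("ggplot2", "dplyr")

# installed.packages() returns a matrix with one row per installed package;
# its row names are the package names.
already_installed <- rownames(installed.packages())

# Which of the wanted packages are still missing?
to_install <- setdiff(wanted, already_installed)
to_install

# Uncomment to download and install the missing packages from CRAN.
# install.packages(to_install)
```

If to_install is empty, there is nothing to do.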
We will now describe how to install R and the RStudio Desktop, and then, after we
say more about how to use R and RStudio, we will look at installing R packages in a
separate section.
Installing R
To install the latest version of R on Windows, go to:
https://cran.r-project.org/bin/windows/base/
For the installer for the latest version of R for Macs, go to:
https://cran.r-project.org/bin/macosx/
For Windows, the installer is an executable file (with an .exe extension). For Macs, it
is the installer file ending with the .pkg extension. In either case, always go for the
installer of the latest R version (which, as of August 2020, is 4.0.2). We install R with
these installers just as we would install any other program on Windows or Macs.
If we’re using Linux, we can follow steps similar to the above to get an installer.
However, it is likely that everything will be simpler if we just use our Linux
distribution’s own package manager and install R with that. For example, with
Debian- and Ubuntu-based distributions, we can use the apt-get package manager, or
use the pacman installer on Arch Linux.
Installing RStudio Desktop
The RStudio Desktop is one of the software products created by the company
RStudio. Although RStudio is a private commercial company, it releases its software
under a free and open source licence (namely the GNU Affero General Public
License). It also sells technical support for these products, but this is aimed at
companies and organizations rather than individuals. Simply put, we can use RStudio
software just as we use all R software: at no cost, and according to a free and open
source licence. Any prices that are listed for RStudio software are for commercial
support.
The webpage
https://www.rstudio.com/products/rstudio/download/
provides us with the necessary links to the Windows, Mac, and Linux installers.
Choose the RStudio Desktop: Free option, which we will see is listed as under an
open source licence. The latest version (as of August 2020) is 1.3.1056. We use the
installers here just as we would use any Windows and Mac installers. For Linux users,
there are versions for popular Linux distributions such as Ubuntu and Fedora.
2.3 GUIDED TOUR OF RSTUDIO DESKTOP
Having installed R and the RStudio Desktop, we can now effectively forget about R. It
is fully installed, and it will be doing all the computing whenever we use RStudio
Desktop, but it runs in the background and we don’t have to use it directly. For all
practical purposes, it will seem like we are just using a single standalone desktop
application.
A note on terminology. Strictly speaking, RStudio is the company that created and
maintains the RStudio Desktop software, among other pieces of software. In practice,
almost everyone, including those who work for RStudio itself, refers to RStudio
Desktop simply as RStudio, and we’ll do so here as well. Another product made by
RStudio, the RStudio Server, which we’ll describe below, is also sometimes called
RStudio, but we’ll usually explicitly refer to that as the RStudio Server.
We open RStudio just as we would any other desktop application (e.g. double-clicking
an icon, or typing its name in a launcher), and when we do so, we will be greeted by
something that should look exactly like Figure 2.1.
Figure 2.1 The typical layout of the RStudio Desktop when it is first opened
We will see three main windows, which we will describe in more detail below. For
now, we see that to the left, occupying about half the screen by default, is a window
with the console pane. To the right, there are two other windows arranged vertically.
Usually (in fact almost always) we also have two windows on the left. On top of the
window with the console pane, we usually have a script editor. If it is not present, it
can be brought up by the key command Ctrl+Shift+N (Cmd+Shift+N on Macs), or by
going to the File menu at the top of the screen and choosing New File > R Script.
Doing so will create a blank and untitled R script in the script editor window, which
will now occupy the upper left quadrant of the screen. Our screen should now look
like Figure 2.2.
Figure 2.2 RStudio Desktop with R script editor window in upper left quadrant
We’ll now look in more detail at each of these four windows.
Console window The console window usually occupies about half of the left-
hand side of the screen. Like all windows, it can be resized with the mouse, or
with the window resize buttons to the upper right of each window. In the
console window, by default, there are tabs for three panes: the Console, the
Terminal, and Jobs. The tab for the console is usually the active one. This is
where we type R commands, followed by Enter, and get all our results and
output. It is the single most important part of the RStudio Desktop. We will use it
extensively, beginning with our introduction to R commands in the next section.
The next tab is the Terminal, and this is the command line interface to our
computer’s operating system. So, on Windows, this is usually the DOS command
line. On Macs and Linux, it is the Unix shell, such as the bash shell. Unlike the
console, the terminal is not as widely used. It may in fact never be used, and is
only necessary when we need a command line interface to our operating system.
The Jobs pane is also not as widely used. It is for running scripts
asynchronously.
Script editor window The script editor is where we write scripts of R
commands. We use scripts whenever we want to save our R commands for later
reuse, or whenever the R commands are becoming relatively long and complex.
We write here just as we would write in any text file editor (e.g. Notepad), and
we can save these files on our computer’s file system as normal. As we’ll see
below, we can execute or run the commands we write in any script either line by
line or region by region or by executing the whole script at once. If we run
individual lines or regions, the R code effectively gets copied to the console
followed by Enter just as if we copied and pasted from the editor to the console.
We can run the whole script at once, as we’ll see below, by using the source
command, for which there is a button on the upper right of the editor window.
Environment, History, etc., window In the upper right pane, there are tabs
for the Environment, History, and Connections panes. Sometimes there are other
tabs, such as for Build and Git. Of these, Environment and History are the more
commonly used. As we’ll see as soon as we start using R commands and
creating variables, the details of the variables and data structures that are in our
current R session are listed in the Environment. We will have the option of
clearing or deleting any or all of these whenever we wish. Likewise, we may save
all of these objects to file, and reload them later. The Environment window also
provides us with a convenient means of importing data files. The History window
provides a list of all the R commands that we have typed. This is a particularly
useful feature, as we will see. It allows us to review everything we have typed in
our R session, and allows us to extract and rerun any commands we want. We
may also save our command history to file at any time. The other tabs available
in this window are not usually used as often. Connections is used for connecting
with databases or clusters, and we will see it again later in the book. Build,
which may not be listed at all, is for building packages or compiling code. Git,
which likewise may not exist, is for version control of our code using Git. We will
talk about using Git for version control in Chapter 6.
Files, Plots, Packages, Help, Viewer window The lower right window
provides tabs for browsing files, viewing plots, managing R packages, reading
help files, and for viewing html documents. The Files window is a regular file
browser, where we can view, create, delete, etc., files and directories. The Plots
window is where all figures created during our R session are shown. If we
produce many figures, they are placed in a stack and we can move forwards and
backwards between them. The Packages window, which we will return to in
more detail in the next section, lists all our installed R packages. From here, we
can also activate packages for our current R session, as well as install new
packages. The Help window displays help pages for any R command or package.
We can browse through these pages, but a particularly useful feature is how we
can jump straight to a needed help page for a command or package directly
from the R command line or script editor. We will return to this feature in one of
the next sections. The Viewer window is where we can view html pages that are
created in RStudio. These could include Shiny web apps or the html pages
produced by RMarkdown documents. These are topics to which we will return in
later chapters.
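As a small sketch of jumping to a help page from the command line, the following commands use only base R. In RStudio, the resulting pages are displayed in the Help window.

```r
# Open the help page for the mean() function.
?mean

# The same thing, written as an ordinary function call.
help("mean")

# List the help topics available in a package, here the built-in stats package.
help(package = "stats")
```

The ? form is simply a shortcut for help(), so which one we use is a matter of preference.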
RStudio menus
At the top of the RStudio Desktop there is the following set of menus.
File The File menu is primarily for the opening, closing and saving of files. Often,
these files will be R scripts that open in the script editor. But they could also be R
data files, RMarkdown documents, Shiny apps, etc. Here, we can also open and
close RStudio projects, which is a very useful organizing feature to which we will
return below.
Edit The Edit menu primarily provides tools for standard file editing operations
such as copy, cut, paste, search and replace, and undo and redo. It also provides
code folding features, which are very useful for reducing clutter when editing
relatively large R scripts.
Code The Code menu provides many useful tools for making editing and running
code considerably easier and more efficient. We will explore these features in
more depth in subsequent sections, but they include adding and removing code
comments, reformatting code, jumping to functions within and between scripts,
and creating code regions that can then be run independently.
View The View menu primarily provides options to move around RStudio quickly.
These options are all bound to key combinations, as are many other RStudio
features, and learning these key combinations is certainly worthwhile because of
the eventual speed and efficiency gains that they provide.
Plots The Plots menu primarily provides features that are also available in the
Plots window itself.
Session The Session menu allows us to start new separate RStudio sessions.
These then run independently of one another. Also in the Session menu, we can
restart the R session in the background, which is a useful feature. Remember
that RStudio itself is just a front to an R session that runs in the background.
Sometimes it is a good idea to restart the R session so as to start in a clean and
fresh state. This can be done through the Session menu Restart R option, which
is also bound to the key combination Ctrl+Shift+F10. Also in Session are options
to set R’s working directory. The concept of a working directory is a simple but
important one, and we will cover it below.
Build The Build menu provides features for running scripts for software builds.
This is particularly used for creating R packages.
Debug The Debug menu provides tools for debugging our R code. Debugging
usually only becomes a necessity when doing R programming per se, and is not
something that is usually required when writing individual commands or scripts
of commands.
Profile The Profile menu provides tools for profiling the running and efficiency
of our R code. Code efficiency is certainly not something that those new to R
need to worry about, but when writing relatively complex code, profiling can
identify bottlenecks.
Tools The Tools menu provides miscellaneous tools such as for working with
version control using Git, accessing the computer operating system’s command
line interface, installing and updating packages (as could also be done in the
Packages window), and viewing and modifying keyboard shortcuts. Here, we can
also access the Global Options and Project Options. Global Options is where all
the general R and RStudio settings are set. One immediately useful setting here
is the Appearance setting, which can allow us to change the font, font size, and
colour theme of RStudio to suit our preference. Project Options are for the
RStudio project-specific settings. We will return to these below.
Help The Help menu provides much the same information as can be found in
the Help window. It also provides some additional links to online resources, such
as RStudio’s cheat sheets,6 which are excellent concise guides to many different
R and RStudio topics. Also available in the Help menu are tools to access RStudio
internal diagnostics. This is only needed if RStudio seems to be malfunctioning.
6 https://www.rstudio.com/resources/cheatsheets/
2.4 FIRST STEPS IN R
R is a very powerful tool. While it is unquestionably a wonderful thing that everyone
can have access to this powerful tool at no monetary cost, now or ever, R’s power
can also be intimidating and off-putting at the beginning. People who wish to learn R
may simply not know where to begin. Worse still, some people dive in too deep at
the beginning, tackling complex analyses before they have a good handle on the
basics, and find things difficult and frustrating, and may then even abandon trying to
learn R at all.
To learn R, it is best to learn the fundamentals first. In what follows, we have
provided a sequence of steps that cover many of these. With each step, we’ll
introduce some fundamental concepts or tools, and these concepts and tools can
build on one another and be combined to lead to yet more concepts and tools. What
we will not cover in these steps are the major topics like data wrangling, data
visualization and statistical analyses. These will be covered in depth in subsequent
chapters, and build upon a knowledge of the fundamentals that we cover here.
Step 0: Using the R console
R is a command-based system. We type commands, R translates them into machine
instructions, which our computer then executes, and then we often, but not
necessarily, get back some output. The commands can be typed into the R console,
or else they can be put into a script and run as a batch. When learning R, it is usually
best to start with typing commands in the R console.
When we initially open RStudio, our console will usually look something like this:
R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
>
Notice that, at the bottom, there is a single line beginning with >. This is the R
console’s command prompt, and it is where we type our commands. We then press
Enter, and our command is executed. The output of the command, if any, is displayed
on the next line or lines, and then a new > prompt appears.
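As a first thing to try at the prompt, the following sketch runs two of the items that the startup message refers to. Both are part of base R; the exact text they print depends on the installed version of R.

```r
# The version string shown in the first line of the startup message.
R.version.string

# How to cite R in publications, as suggested by the startup message.
citation()
```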
Step 1: Using R as a calculator
A useful way to think about R, and not an inaccurate one either, is that it is simply a
calculator. It is useful therefore to learn R by first treating it just like an ordinary
handheld calculator, and then learning the ways it extends or goes beyond the
capabilities of a calculator. Just as we would with a calculator, we can start using R by
doing arithmetic – adding, subtracting, multiplying, dividing, and so on.
So, let’s start by adding 2 and 2. We do this by typing 2 + 2 at the command
prompt, and then pressing Enter. What we see should look exactly7 as follows:
7 In this book, each line of the console’s output begins with #>, which will not appear
in the actual console. We use the #> just to make it easier to read and to distinguish
between the commands that we input into the console and the output that is then
displayed.
> 2 + 2
#> [1] 4
Notice that the result of the calculation is displayed on the line following the
command. That line begins with a [1], and we will return to what this means below
but for now it can be ignored.
Now let’s do some more. Each time, we’ll type the command, press Enter, get the
result, and then a new > prompt occurs, and we type the next command and so on.
So calculating the sum of 3 and 3, and of 10, 17 and 5, will lead to the following:
> 3 + 3
#> [1] 6
> 10 + 17 + 5
#> [1] 32
Using history. Before we proceed, take a look at our History window. It should look
like this:
2 + 2
3 + 3
10 + 17 + 5
In other words, it is a list of everything we’ve just typed. If we click on any line here,
we will select it. If we then click on To Console, this line will be copied to the console,
and we can press Enter to re-execute. More usefully, we can move through our
history with the up and down arrow keys on our keyboard. Repeatedly pressing the
up arrow key will bring up the list of each command we just typed, and repeatedly
pressing the down arrow key will bring us back down. At any point as we go through
the list, we can press Enter to rerun that command, which then adds that command
onto the end of the History.
Let’s look at some more arithmetic. For subtraction, we use the - symbol:
> 5 - 3
#> [1] 2
> 8 - 1
#> [1] 7
Multiplication uses the asterisk symbol *:
> 5 * 5
#> [1] 25
> 2 * 4
#> [1] 8
Division uses the / symbol:
> 1 / 2
#> [1] 0.5
> 2 / 7
#> [1] 0.2857143
Exponents (i.e. raising a number to the power of another number) are accomplished
with the caret symbol ^:
> 2 ^ 8
#> [1] 256
> 10 ^ 3
#> [1] 1000
Exponents also work if we use **, but this is less common:
> 2 ** 8
#> [1] 256
> 10 ** 3
#> [1] 1000
Just like on a calculator, we can combine the +, -, *, /, ^ operators in any
commands:
> 2 + 3 - 6 / 2
#> [1] 2
> 10 * 2 - 3 / 5 ^ 2
#> [1] 19.88
Note that the order of precedence of operations is ^, followed by * or /, followed by +
or -, and just like on a calculator, we can use parentheses to control the order of
operations:
> 2 / (3 * 2)
#> [1] 0.3333333
> (2 / 3) * 2
#> [1] 1.333333
In the above, we were always dealing with positive values. However, we can always
precede any number with a - to get its negative value:
> -2 * 3
#> [1] -6
> 10 ^ -3
#> [1] 0.001
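The precedence rules have two consequences that are easy to overlook, sketched below using only the operators introduced so far. Repeated ^ operators are evaluated from right to left, and ^ is applied before the - that negates a value, so parentheses are needed to raise a negative number to a power.

```r
# ^ is applied first, then *, then +.
2 + 3 * 4 ^ 2
#> [1] 50

# Repeated ^ is evaluated right to left: 2 ^ (3 ^ 2), which is 2 ^ 9.
2 ^ 3 ^ 2
#> [1] 512

# ^ is applied before the negation, so this is -(2 ^ 2).
-2 ^ 2
#> [1] -4

# Parentheses make the base negative.
(-2) ^ 2
#> [1] 4
```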
Notice that throughout all the above commands, we put a space around the +, -, *,
/, ^ operators. This is a matter of recommended style to improve readability, not a
requirement. In other words, the following all work in exactly the same way:
> 2+3
#> [1] 5
> 2+ 3
#> [1] 5
> 2 + 3
#> [1] 5
It is just recommended to use the 2 + 3 version. On the other hand, while it is also
possible to have spaces after the - sign that we use to negate a value, the
recommended style is to not do so, and so we use 3 / -2 and not 3 / - 2,
though both will do the same thing.
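Going slightly beyond the operators covered above, base R also provides integer division, remainders, and the usual calculator functions. The following is a minimal sketch of a few of them; they are not needed for what follows in this step.

```r
# Integer division: how many whole times does 5 go into 17?
17 %/% 5
#> [1] 3

# Remainder (modulo): what is left over?
17 %% 5
#> [1] 2

# Square root and absolute value, as on a scientific calculator.
sqrt(16)
#> [1] 4
abs(-3)
#> [1] 3
```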
Step 2: Variables and assignment
A major step forward in using R, and a major step beyond the capabilities of a
handheld calculator, is the use of variables and the assignment of values to variables.
In effect, this simply allows us to store the values of calculations for later use, but
that in fact is a very powerful thing.
Consider what happens when we type the following at the command prompt and
then press Enter:
> (12 / 3.5) ^ 2 + (1 / 2.5) ^ 3 + (1 + 2 + 3) ^ 0.33
#> [1] 13.6254
All the constituent numbers are stored in our computer’s memory, calculations are
done on them, and then the result is stored in memory. This is then displayed as the
output on our screen, and because nothing further needs to be done with it, it is
removed from memory. We can, however, keep this value stored in memory by
assigning it to a variable. We do this with the assignment operator <-, which is a <
symbol followed directly by a - symbol. In RStudio, the <- can be typed with the key combination
Alt+- (i.e. Alt key and minus key together) or Option+- on the Mac.
So if we want to assign the value of the above calculation to a variable named x, we
would type
> x <- (12 / 3.5) ^ 2 + (1 / 2.5) ^ 3 + (1 + 2 + 3) ^ 0.33
Notice that on pressing Enter here, there is no output on screen. The calculation is
done as normal but rather than outputting it to the screen, the value is assigned to
the variable named x. If we now type the name x at the prompt, its value will be
displayed:
> x
#> [1] 13.6254
We can then do calculations with this variable just like we would with any other
number:
> x ^ 2
#> [1] 185.6516
> x * 3.6
#> [1] 49.05145
And we can assign any of these values to new variables:
> y <- x ^ 2
> z <- x * 3.6
In general, the assignment rule is
name <- expression
The expression is any R code that returns some value. This could be the result of
a calculation, as in the above examples, but it could also be the ‘output’, or result, of
some complex statistical analysis, which is something we will see repeatedly in later
sections and chapters. In the simplest case, it could also be just a single numeric or
other value, such as
> x <- 42
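Because the expression on the right-hand side is evaluated before the assignment happens, a variable can appear in the expression that updates it. The sketch below shows this, along with a common convenience: wrapping an assignment in parentheses prints the assigned value.

```r
x <- 42

# The right-hand side is evaluated first, and the result then replaces x.
x <- x + 1
x
#> [1] 43

# Wrapping an assignment in parentheses assigns and prints in one step.
(y <- x * 2)
#> [1] 86
```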
The name has to follow certain naming conventions. In all the above examples, we
used a single lowercase letter, but in general it can consist of multiple characters.
Specifically, it can consist of letters, which can be either lowercase or uppercase,
numbers, dots, and underscores. It must, however, begin with a letter or a dot that is
not followed by a number. So all of the following are acceptable:
x123
.x
x_y_z
xXx_123
But all of the following are not acceptable:
_x
.2x
x-y-z
Although many names like x123 etc. are valid names, the recommendation is to use
names that are meaningful, relatively short, without dots (using the underscore _
instead for punctuation), and primarily consisting of lowercase characters. Examples
like the following are recommended:
> age <- 29
> income <- 38575.65
> is_married <- TRUE
> years_married <- 7
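One way to check whether a candidate name is valid is sketched below. It relies on the base R function make.names(), which converts any string into a valid name and leaves already valid names unchanged. The c() function used to collect the candidates is introduced in the next step, and the variable names candidates and is_valid are our own choices for illustration.

```r
# The acceptable and unacceptable examples from the text above.
candidates <- c("x123", ".x", "x_y_z", "xXx_123", "_x", ".2x", "x-y-z")

# A name is valid if make.names() has no need to change it.
is_valid <- make.names(candidates) == candidates
is_valid
#> [1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
```

This check also rejects reserved words such as TRUE or if, which cannot be used as variable names even though they consist only of letters.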
Step 3: Vectors
Up to now, all the values we’ve been dealing with have been single values. In
general, we can have variables in R that refer to collections of values. These are
known as data structures. There are many different types of data structures in R, but
the two that we encounter most often are vectors and data frames. Data frames are
probably the single most important data structure in R and are the default form for
representing data sets for statistical analyses. But data frames are themselves
collections of vectors, and so vectors are a very important fundamental data structure
in R. In fact, as we will see, single values, like all those used above, are actually
vectors with exactly one element.
Vectors are one-dimensional sequences of values. While they will often be created for
us by the R functions that we use, such as by some data analysis functions, we can
also create vectors ourselves using the c() function. For example, if we want to
create a vector of the first 10 prime numbers, we could type the following:
> primes <- c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
To use the c() function, where c stands for combine, we simply put within it a set of
values with commas between them.
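As a small sketch of what combine means in practice, c() accepts existing vectors as well as single values and joins everything into one flat vector. The name more_primes is our own choice for illustration.

```r
primes <- c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)

# length() gives the number of elements in a vector.
length(primes)
#> [1] 10

# c() can combine an existing vector with further values.
more_primes <- c(primes, 31, 37)
length(more_primes)
#> [1] 12
```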
Vector operations
We can now perform operations, like those we saw above, on the primes vector just
as we would on a single-valued variable. For example, we can do arithmetic
operations:
> primes + 1
#> [1] 3 4 6 8 12 14 18 20 24 30
> primes / 2
#> [1] 1.0 1.5 2.5 3.5 5.5 6.5 8.5 9.5 11.5 14.5
> primes ^ 2
#> [1] 4 9 25 49 121 169 289 361 529 841
In these cases, the operations are applied to each element of the vector.
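The same element-by-element behaviour applies when both sides of an operator are vectors of the same length. Other base R functions instead summarize a whole vector as a single value. A minimal sketch:

```r
primes <- c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)

# Two vectors of equal length are combined element by element.
c(1, 2, 3) + c(10, 20, 30)
#> [1] 11 22 33

# Functions such as sum() and mean() reduce a vector to a single value.
sum(primes)
#> [1] 129
mean(primes)
#> [1] 12.9
```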
Indexing
For any vector, we can refer to individual elements using indexing operations. For
example, to get the first or fifth elements of primes, we would use square brackets
as follows:
> primes[1]
#> [1] 2
> primes[5]
#> [1] 11
If we want to index sets of elements, rather than just individual elements, we can
use vectors (made with the c() function) inside the indexing square brackets. For
example, if we want to extract the seventh, fifth and third elements, in that order, we
can type the following:
> primes[c(7, 5, 3)]
#> [1] 17 11 5
If we want to refer to a consecutive set of elements, such as the second to the fifth
elements, we can type the following:
> primes[2:5]
#> [1] 3 5 7 11
In R, the expression n:m, where n and m are integers, gives us the vector of integers
from n to m. So, for example, 2:5 means the same thing as c(2, 3, 4, 5).
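As a quick check of the n:m notation, the sketch below compares 2:5 with its c() equivalent and shows that the sequence counts downwards when n is larger than m. One detail, offered here as a side note: n:m produces values of class integer, whereas c(2, 3, 4, 5) is numeric, so the two are equal element by element without being strictly identical objects.

```r
# Element-by-element, the two vectors hold the same values
all(2:5 == c(2, 3, 4, 5))
#> [1] TRUE

# The colon operator returns integers
class(2:5)
#> [1] "integer"

# If the first number is larger, the sequence counts down
5:2
#> [1] 5 4 3 2
```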
If we use a negative-valued index, we can refer to all elements except one. For
example, all elements of primes except the first element, or except the second
element, can be obtained as follows:
> primes[-1]
#> [1] 3 5 7 11 13 17 19 23 29
> primes[-2]
#> [1] 2 5 7 11 13 17 19 23 29
If we precede a vector of indices by a minus, we’ll return all elements except those in
the index vector. For example, we can get all elements except the seventh, fifth and
third elements as follows:
> primes[-c(7, 5, 3)]
#> [1] 2 3 7 13 19 23 29
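Two further points about indexing may be worth checking for ourselves. First, indexing returns a new vector and leaves the original untouched. Second, asking for a position beyond the end of a vector gives NA rather than an error. The name shorter below is just an illustrative choice.

```r
primes <- c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)

# Negative indexing returns a copy; primes itself is unchanged
shorter <- primes[-c(7, 5, 3)]
length(shorter)
#> [1] 7
length(primes)
#> [1] 10

# A position past the end of the vector gives NA
primes[11]
#> [1] NA
```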
Single-valued vectors
Unlike some other programming languages, R does not represent single values as
elementary data types. Single values are in fact just vectors with only one element,
and so can be indexed, and so on, just like any other vector:
> x <- 42
> x[1]
#> [1] 42
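We can confirm this with a few base R functions. The short sketch below checks that x has length one, that R regards it as a vector, and that it is the same object we would get by wrapping the value in c().

```r
x <- 42

# A single value is a vector of length one
length(x)
#> [1] 1
is.vector(x)
#> [1] TRUE

# Wrapping the value in c() makes no difference
identical(x, c(42))
#> [1] TRUE
```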
Character vectors
Our primes vector is a sequence of decimal numbers. We can verify this by
applying the class function to the vector and seeing that it is a numeric vector:
> class(primes)
#> [1] "numeric"
We can, however, have many other types of vectors. For example, we can have
character strings, which are strings of characters that are surrounded by quotation
marks. These are, in fact, very widely used in R. For example, here’s a vector of the
names of the six nations taking part in an annual rugby union tournament:
> nation <- c('ireland', 'england', 'scotland', 'wales',
'france', 'italy')
This vector is of type character, which we can verify as follows:
> class(nation)
#> [1] "character"
Note that we can use single or double quotation marks for each string, as in the
following example:
> nation <- c("ireland", 'england', "scotland", 'wales',
'france', 'italy')
Just like numeric vectors, we can index character vectors:
> nation[1]
#> [1] "ireland"
> nation[-2]
#> [1] "ireland" "scotland" "wales" "france" "italy"
> nation[4:6]
#> [1] "wales" "france" "italy"
We cannot, however, perform arithmetic operations on character vectors. We will
obtain an error if we try:
> # does not work
> nation + 2
#> Error in nation + 2: non-numeric argument to binary operator
> nation * 2
#> Error in nation * 2: non-numeric argument to binary operator
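Although arithmetic fails, base R does provide functions that are designed for character vectors and that, like arithmetic on numbers, are applied to each element. The following sketch uses three of them, nchar, toupper and paste; the word 'rugby' is simply an arbitrary string chosen for the example.

```r
nation <- c('ireland', 'england', 'scotland', 'wales', 'france', 'italy')

# Number of characters in each string
nchar(nation)
#> [1] 7 7 8 5 6 5

# Convert a string to upper case
toupper(nation[1])
#> [1] "IRELAND"

# Join each string to another string, separated by a space
paste(nation[1:2], 'rugby')
#> [1] "ireland rugby" "england rugby"
```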
Logical vectors
Another widely used type of vector is a logical or Boolean vector. A Boolean variable
is a binary variable that takes on values of true or false. In R, these values are
represented using TRUE or T for true, and FALSE or F for false. For example, a
vector of values representing whether each of a set of five people is male could be as
follows:
> is_male <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
This could also be created more succinctly as follows:
> is_male <- c(T, F, T, T, F)
Using class, we can verify that this vector is a logical vector:
> class(is_male)
#> [1] "logical"
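A word of caution about the T and F shorthand: TRUE and FALSE are reserved words, but T and F are ordinary variables that merely start out with those values, so they can be overwritten. The sketch below demonstrates this and then undoes it; the variable shadowed is a hypothetical name used only to store the result.

```r
# T starts out equal to TRUE
identical(T, TRUE)
#> [1] TRUE

# Unlike TRUE, T can be overwritten (done here only for demonstration)
T <- 0
shadowed <- identical(T, TRUE)
shadowed
#> [1] FALSE

# Remove our copy so that T refers to the built-in value again
rm(T)
identical(T, TRUE)
#> [1] TRUE
```

For this reason, writing TRUE and FALSE in full is the safer habit in code that other people will run.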
We can index logical vectors just like numeric or character vectors:
> is_male[2]
#> [1] FALSE
> is_male[c(1, 2)]
#> [1] TRUE FALSE
Arithmetic operations can be applied to logical vectors, but only by
first converting TRUE and FALSE to the numbers 1 and 0, respectively:
> is_male * 2
#> [1] 2 0 2 2 0
> is_male - 2
#> [1] -1 -2 -1 -1 -2
The vector returned by operations like this is a numeric vector:
> result <- is_male * 2
> class(result)
#> [1] "numeric"
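A practical consequence of this conversion is that summing a logical vector counts its TRUE values, and taking its mean gives the proportion of TRUE values. A minimal sketch using the same is_male vector:

```r
is_male <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

# TRUE counts as 1 and FALSE as 0, so the sum is the number of TRUE values
sum(is_male)
#> [1] 3

# The mean is the proportion of TRUE values
mean(is_male)
#> [1] 0.6
```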
We can also apply Boolean or logical operations to logical vectors. These logical
operations are AND, OR, and NOT. The AND operator tests if two logical values are
both true. In R, it is represented by &:
> TRUE & FALSE
#> [1] FALSE
> TRUE & TRUE
#> [1] TRUE
In these examples, we have a single Boolean value on either side of the &, but
given that a single value is a vector of one element, we could also apply the & to
vectors of multiple elements:
> c(T, F, T) & c(T, T, F)
#> [1] TRUE FALSE FALSE
In this case, the & operation is individually applied to the first, second, and third
elements of both vectors.
The OR operator tests if at least one of two logical values is true, and is
represented by the | character:
> TRUE | FALSE
#> [1] TRUE
> TRUE | TRUE
#> [1] TRUE
We can negate a logical value by using the ! operator:
> !TRUE
#> [1] FALSE
> !FALSE
#> [1] TRUE
Just as we can combine arithmetic operations, so too can we combine logical
operations, using parentheses to control the order of operations if necessary:
> (TRUE | !TRUE) & FALSE
#> [1] FALSE
> (TRUE | !TRUE) & !FALSE
#> [1] TRUE
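Like &, the | and ! operators also work element by element on longer vectors. The sketch below applies all three to is_male and to a second logical vector, is_adult, which is invented here purely for illustration. It also uses the base R functions any and all, which reduce a logical vector to a single TRUE or FALSE.

```r
is_male <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
is_adult <- c(TRUE, TRUE, FALSE, TRUE, TRUE)  # hypothetical values

is_male & is_adult
#> [1] TRUE FALSE FALSE TRUE FALSE
is_male | is_adult
#> [1] TRUE TRUE TRUE TRUE TRUE
!is_male
#> [1] FALSE TRUE FALSE FALSE TRUE

# Is at least one element TRUE? Are all elements TRUE?
any(is_male)
#> [1] TRUE
all(is_male)
#> [1] FALSE
```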
Equality/inequality operations
Equality and inequality operations can be applied to different types of vectors, and
also return logical vectors. For example, we can test if each value of a vector of
numbers, such as primes, is equal to a specific value as follows:
> primes == 7
#> [1] FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
Again, given that a single value is a vector of one element, we can also apply this
operator to individual values:
> 12 == (6 * 2)
#> [1] TRUE
And we can test if values are not equal with !=, as in the following examples:
> 12 != (6 * 2)
#> [1] FALSE
> primes != 3
#> [1] TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
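One caveat applies when testing equality on numbers with decimal parts. Such numbers are stored in binary floating point, so the result of a calculation can differ from the value we expect by a tiny amount, and == will then return FALSE. The sketch below shows a well-known case and one common workaround, the base R function all.equal, which compares values up to a small tolerance.

```r
# Exact comparison fails because of rounding in the stored values
0.1 + 0.2 == 0.3
#> [1] FALSE

# all.equal allows for a small tolerance; isTRUE turns its result into TRUE or FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))
#> [1] TRUE
```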
We can test if numbers are less than, or greater than, another number with the <
and > operators, respectively:
> primes < 5
#> [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> primes > 8
#> [1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE
We can test if numbers are less than or equal to, or greater than or equal to, another
number with the <= and >= operators, respectively:
> primes <= 7
#> [1] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
> primes >= 3
#> [1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
We can also apply equality or inequality operations to character vectors:
> nation == 'england'
#> [1] FALSE TRUE FALSE FALSE FALSE FALSE
> nation != 'england'
#> [1] TRUE FALSE TRUE TRUE TRUE TRUE
The meaning of some inequality operations, such as the following, may be initially
unclear:
> nation >= 'italy'
#> [1] FALSE FALSE TRUE TRUE FALSE TRUE
In this case, greater than or less than is defined by alphabetical order, and so
nation >= 'italy' is evaluating whether each listed nation is alphabetically after,
or the same as, the string italy.
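Because these comparisons return logical vectors, their results can be passed on to other operations. The sketch below is a small illustration, not a complete treatment: it counts how many elements satisfy a condition with sum, finds their positions with which, and uses a logical vector inside the indexing square brackets to keep only the elements for which it is TRUE.

```r
primes <- c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)

# How many primes are less than 10?
sum(primes < 10)
#> [1] 4

# At which position is the value 7?
which(primes == 7)
#> [1] 4

# Keep only the elements greater than 8
primes[primes > 8]
#> [1] 11 13 17 19 23 29
```

For character vectors, the exact ordering used by operators like >= follows the collation rules of the current locale, so comparisons involving mixed case or accented letters may give different results on different systems.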
Coercing vectors
An important property of all vectors is that they are homogeneous, in that all their
elements must be of the same data type. For example, we can’t have a vector with
some numbers, some logical values, and some characters. If we try to make a
heterogeneous vector like this, some of our elements will be coerced into other
types. If we, for example, try to combine some logical values with some numbers,
the logical values will be coerced into numbers (TRUE will be converted to 1, FALSE
will be converted to 0):
> c(TRUE, FALSE, 3, 2, -1, TRUE)
#> [1] 1 0 3 2 -1 1
If we try to combine numbers or logical values with character strings, they will all be
coerced into strings, as in the following example:
> c(2.75, 11.3, TRUE, FALSE, 'dog', 'cat')
#> [1] "2.75" "11.3" "TRUE" "FALSE" "dog" "cat"
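Coercion can also be requested explicitly with base R functions such as as.numeric and as.character. The sketch below shows a few conversions, including what happens when a string cannot be interpreted as a number: the result is NA, accompanied by a warning, which is suppressed here only to keep the output short. The variable converted is a hypothetical name.

```r
# Implicit coercion: the result takes the more general type
class(c(TRUE, 2))
#> [1] "numeric"
class(c(2, 'dog'))
#> [1] "character"

# Explicit coercion
as.numeric(c(TRUE, FALSE))
#> [1] 1 0
as.character(c(2.75, 11.3))
#> [1] "2.75" "11.3"
as.numeric('3.5')
#> [1] 3.5

# A string that is not a number becomes NA (R also issues a warning)
converted <- suppressWarnings(as.numeric('dog'))
converted
#> [1] NA
```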
Combining vectors
“Gee! I thought of that,” declared Patsy. “You have hit the nail on the
head, chief, for fair.”
“I think that these crooks, in order to expedite matters and create a
general belief that Gordon has murdered Pauline Perrot, planted this
evidence and probably more, and immediately started Henley with it to
inform the constable, aiming to get in their work on old Mr. Strickland as
soon as possible. I saw that Henley was a bit set back when he discovered
my identity and that I already was at work on the case.”
“I noticed that, too, chief,” put in Patsy.
“Henley decided to seize the bull by the horns, however, pretending he
wanted to aid me, and I think he now has something up his sleeve,” Nick
added. “I’m going to give him a chance to show his hand.”
“How so?” Chick questioned.
“I’m not yet sure what I shall frame up. Be that as it may, Chick, you hike
back to town and get after Dayton. It’s dollars to fried holes that he has a
hand in this game. Use your own judgment as to the best course to shape,
and leave Patsy and me to tie knots in this end of the string. That’s all for the
present.”
“Enough said, too, Nick,” replied Chick, seizing his hat. “You have
pulled off a clever bit of work, remarkably clever, and we’re now right in
line to deliver the goods. Leave Dayton to me. I’ll get him.”
Chick did not wait for an answer. He hurried out of the house and started
for town in the taxicab.
CHAPTER VII.

HENLEY SHOWS HIS HAND.

It was, indeed, a clever bit of detective work that had enabled Nick Carter
to form a theory consistent with all of the circumstances and the
accumulation of evidence denoting that Arthur Gordon was guilty of the
basest of treachery and the most heinous of crimes, and which would have
been convincing not only to the public, but probably to all other detectives
than Nick Carter himself.
He keenly realized, however, that a theory based only upon his own
convictions was not enough, that absolute evidence was needed to convince
others, and he was not long in hitting upon a plan by which he thought he
could obtain it.
Nick hurriedly explained it to Patsy, giving him a few necessary
instructions, and he then sent him to call the suspected man from the kitchen.
Henley came slouching into the library a moment later, with Ginger
trailing at his heels. He had a more lowering look in his shifty eyes. He had
become impatient and suspicious during his long wait. He did not fancy his
having been excluded from the conference of the detectives. It smacked of
distrust of him, and his resentment was manifest in his swarthy face.
Nick saw it, of course, and at once took steps to dispel it.
“Pardon me, Henley, for keeping you waiting so long,” he apologized
with a heartiness well calculated to be convincing. “I had no idea it would
take more than a few minutes to examine these articles. Sorry to have kept
you waiting.”
“That’s all right, Mr. Carter,” growled Henley, with countenance lighting.
“Time ain’t wuth much to me. I reckoned you’d want a good look at them.”
“I have examined them carefully, Henley.”
“What d’ye think about it?”
“It looks like a bad mess, very bad,” Nick said, more gravely.
“So it does,” Henley nodded. “There ain’t nothing to it but murder, that I
can see.”
“I’m inclined to agree with you,” Nick replied.
“Sure thing, chief,” put in Patsy. “What else can you make of it? It’s dead
lucky we met Mr. Henley. He sure has put us on the right track.”
“And he can do still more to aid us,” supplemented Nick approvingly. “I
suppose, Henley, you are perfectly willing to assist us. You will be well paid
for your services. I guarantee that.”
“Your word’s good enough for me, Mr. Carter,” said Henley, consenting
with a readiness denoting that his misgivings were entirely dispelled. “I’m
right here to lend you a hand. Say what you want, sir, and I’ll do it.”
“Good enough,” Nick declared. “We’ll set about it at once. Find the
butler, Patsy, and have him give you a pair of Gordon’s shoes. I will look
after those left by the girl. We’ll leave these other articles until we return. I’ll
take the precaution, however, to lock the library door. Get Gordon’s shoes
and rejoin us in the car.”
Patsy hastened from the room, then started upstairs to say a few
encouraging words to Strickland and Wilhelmina.
“I wish to visit the spot where you found these garments, Henley, or
where Ginger nosed them out, to be more correct,” said Nick, taking only the
pair of button boots from the table and thrusting them into his pocket.
“I’ll show you,” said Henley. “That won’t take long.”
“We will expedite matters by going in my car as far as possible,” Nick
added. “Bring along the dog. We may find him useful.”
“He’s some dog, Mr. Carter; you can bet on that.”
“He looks it, Henley, no mistake. One moment while I lock this door and
remove the key. Now, then, we’re off.”
Nick led the way out to the touring car, in which Patsy presently joined
them, bringing a pair of Gordon’s shoes, and in another moment they were
speeding down the long driveway toward the woodland road.
“Take us to the point where we picked Henley up, Danny,” Nick directed.
“He then can take the ribbons and show us the way.”
“You can run a quarter mile farther,” said Henley. “That’ll take us to the
crossroad. It’s rough going, then, too rough for a buzz car.”
“We will walk the remaining distance, Henley, in that case,” Nick replied,
all the while with an air of friendliness and appreciation of his services that
appeared to deceive the swarthy ruffian. “I think you said it is less than a
mile from the road to the pond you mentioned.”
“ ’Tain’t more than half a mile.”
“Just where did you see Gordon and the girl last evening?”
“Going through the crossroad.”
“We traced them to the juncture of the two roads.”
“It was a quarter mile from there that I saw them.”
“Was Gordon carrying a suit case?”
“That’s what,” nodded Henley. “The girl had her jacket over her arm. The
man had an ugly look, and they seemed to be in a fuss over something, but I
couldn’t hear what they said. I watched them till they turned a bend in the
road, and that was the last I saw of them.”
“Gordon looked threatening, did he?”
“I sure would have thought so, Mr. Carter, if he had been looking at me,”
Henley forcibly declared. “He looked fit to fight a dog.”
If Nick Carter had wanted further evidence of Henley’s complicity in the
knavish game that was being played, these last statements would have
convinced him of it, in view of his own discoveries and deductions. He did
not betray his suspicions, but pretended to have entire confidence in the
rascal, interrogating him along much the same lines until Danny brought the
car to a stop at the crossroad.
Nick was the first to alight, followed by Henley and the hound, while
Patsy paused to question:
“Am I to go with you, chief?”
Nick hesitated for a moment, as if he had given this matter no previous
thought, and he then said abruptly:
“No, you’ll not be needed. Henley and I can look over the ground and
accomplish all that can be done.”
“Sure we can,” put in Henley, with ill-concealed eagerness.
“You return with Danny, Patsy, and keep an eye on those things in the
library. There is a bare possibility that some one will try to destroy them, in
case our suspicions are known.”
“That’s right, too,” Patsy quickly agreed. “I thought you were taking a
chance, chief, in leaving them there.”
“You return and look after them,” Nick repeated decidedly. “I’ll hoof it
back with Henley after making an investigation. He won’t mind the tramp.”
“Mind it be hanged!” cried Henley. “Tramping round these diggings is
the most that I do.”
“That settles it, then,” said Nick. “Back into the crossroad to make a turn,
Danny, and wait for us at Gordon’s place.”
“I’ve got you, chief,” nodded Patsy. “We’ll keep an eye on things.”
Nick did not hasten his departure with Henley. He waited until Danny had
turned the touring car, then watched it speed away with both of his
assistants, till it vanished around a near bend in the road.
Henley stood silently watching him, with his shotgun under his arm.
There was a gleam of secret satisfaction deep down in his shifty eyes, an
ominous curve in his thin-lipped mouth. Both vanished instantly, however,
when Nick turned and said:
“Now, Henley, it’s up to you.”
“I’ll make good, all right,” was the reply, with a covert significance the
detective was quick to notice.
“Lead the way, then.”
“I’ll soon show you, Mr. Carter,” Henley added, with the same sinister
significance. “Come on, Ginger. He’s some dog, Carter, some dog. Ginger
can’t be beat.”
Nick did not reply. He followed the swarthy ruffian over the rough
crossroad, stopping at intervals to study the ground, stating that he wanted to
examine the footprints of the missing couple, if any could be found. He
delayed frequently in this way—but with an entirely different object in view.
Twenty minutes brought them to a path through the woodland, into which
Henley struck without hesitation, remarking grimly:
“They must have gone this way. It was on this side of the pond that
Ginger nosed out the bloodstained togs.”
“How far is the pond from here?” Nick inquired, following him.
“Not far,” Henley gruffly assured him. “It’s over the hill and down into
the valley. There’s another path on t’other side of it, leading to a road
running south.”
“Toward Fordham, then.”
“That’s what. Gordon must have known about the pond. ’Tain’t very big,
but it’s as deep as a volcano. The devil himself couldn’t raise a corpse sunk
to the bottom of it. Gordon knew that, mebbe.”
“Quite likely, Henley, since he evidently wanted to get rid of the girl,”
Nick allowed.
“That’s how it looks to me. Bear off this way, sir.”
Henley strode away to the left and plunged through the bushes and
underbrush, Nick following, with Ginger bringing up in the rear.
Ten minutes brought them in sight of the pond, shut in on all sides by a
thick belt of woods, and Nick followed his uncouth guide down to the edge
of it and to the spot he was seeking, a lonely and suitable place enough for
such a crime as superficially appeared to have been committed.
“Here’s the spot,” cried Henley, pointing to some trampled shrubs and
underbrush. “There’s the log where Ginger nosed out the girl’s hat and
jacket. They were rolled up and thrust under it, then partly covered with dirt
and leaves.”
“Yes, yes, I see.”
“Here’s blood on the bushes, and footprints in the ground and dry leaves,
as if the girl put up a fight to save herself from——”
“Stop a moment,” said Nick, intently viewing the evidence mentioned. “I
want to compare these shoes with the imprints.”
“Gordon’s shoes?”
“Yes. The button boots belong to the girl. She left them in a house where
she has been boarding.”
“You went there after them?” questioned Henley, with sinister scrutiny.
“Yes, certainly,” said Nick, without looking up. “By Jove, they
correspond perfectly, Henley. There’s no question about it.”
Nick was comparing both pieces of footwear with several impressions
found in the damp earth. There was, as he had stated, no question as to the
correspondence in size and shape, which was further evidence of who had
been there the previous evening.
“It looks bad, bad enough,” he added, after viewing the blood-spattered
bushes, the rough ground on all sides, and seeking vainly for evidence
showing in which direction Gordon had departed.
“You have made no search for the girl’s body, Henley, you said.”
“What’s the use?” Henley asked, with a growl. “A hundred to one it’s at
the bottom of the pond.”
“Very likely,” admitted Nick, with seeming uncertainty as to what course
to take.
“Gordon wouldn’t have waited to bury it.”
“True again,” Nick allowed. “If we only knew in which direction he went
——”
“We can find that out easy enough,” Henley interrupted, with eyes
gleaming for an instant.
“How so?” asked Nick, though he had expected and been only waiting for
these suggestions. “How can we contrive to trace him?”
“Leave it to Ginger.”
“You mean——”
“Ginger will show you,” Henley cut in. “He can trail him like breaking
sticks. He’s some dog, Mr. Carter, some dog. Wait a bit and I’ll show you.
Gimme one of Gordon’s shoes.”
“By Jove, that’s a good idea, Henley,” Nick cried, as if he had not thought
of it. “He can get the scent from this, perhaps, as you suggest. I ought to
have been wise to that.”
“Here you, Ginger, come here,” Henley growled harshly. “Come here,
you rascal.”
The hound bounded through the bushes and cringed at his master’s feet.
Henley seized him by the scruff of the neck and held to his nostrils the
shoe the detective had given him, then pointed to the larger of the imprints in
the ground.
“Get after him, Ginger!” he commanded, producing a leather strap and
hooking it to the dog’s collar. “Follow him up! After him, Ginger, you
rascal!”
The hound brightened up and appeared to know what was wanted. He
began to bark, until Henley cuffed him fiercely, and then he thrust his
muzzle to the ground, whining and eagerly tugging hard on the leather leash.
Henley seized his shotgun from the ground where he had placed it, crying
gruffly:
“I told you, Carter. He’s got the scent. Come on at my heels. Ginger’ll
trail him.”
“By Jove, I believe you are right, Henley,” Nick cried, following.
“I know I’m right. He’s some dog, sir, some dog.”
“Some dog, Henley, no mistake.”
“Can you stick close?”
“Bet you!” said Nick, as both plunged on after the hound. “You can’t go
too fast for me.”
“Sing out if I do.”
“I’ll hang on, all right. Want me to carry your gun?”
“Not much!” growled Henley. “I’m used to this ’ere business.”
“Gordon evidently went round the pond, instead of back to the
crossroad.”
“That’s so. He most likely was heading for the other road.”
“It looks so, for fair.”
“Ginger’ll trail him. Leave it to Ginger.”
The hound was plunging on all the while, with his muzzle to the ground,
and was shaping a course through the woods and around the south side of the
pond.
“Plainly enough, whoever planted this evidence wore the shoes Gordon
had been wearing,” thought Nick, tramping rapidly on behind Henley.
“That’s evidence enough, too, that he now is in the hands of this rascal’s
confederates. It would be like Mortimer Deland not to overlook a point as
essential as that. Where will the trail end? That’s the question.”
It then was, in fact, almost the only important question in Nick Carter’s
mind. He felt that he had a correct answer for all of the others. He was not
left long in uncertainty, however, for the trail was not a very long one.
Ten minutes brought them to a narrow road on the south side of the pond,
though a quarter mile from it, and the hound started off to the left without a
moment’s hesitation.
Another eighth of a mile brought them to what evidently was an
extensive private estate. There were low walls through the woods, and away
off to the right could be seen at intervals, when the trees and foliage did not
hide them, the white stones and monuments of a distant cemetery.
“Whose place is this, Henley?” Nick inquired, while both scrambled over
a low wall over which the hound had leaped. “Do you know who owns this
estate?”
“Sure I know,” growled Henley, over his shoulder. “I know every place in
these parts.”
“Whose is it?”
“It’s owned by a man named Barker, Colonel Morgan Barker, but he’s in
Europe with his family. The house hasn’t been open for a year.”
Nick remembered the man and the place, also the Barker tomb, in which
Mortimer Deland had temporarily concealed the art treasures stolen from
Rudolph Strickland’s flat in Fifth Avenue, and from which gruesome
confinement Nick had rescued Patsy Garvan on the night of the round-up.
No additional evidence was needed to convince him that he had hit the
nail on the head, that Pauline Perrot and Mortimer Deland were one and the
same, and that this notorious European crook was back of the knavery then
in progress.
“It’s dollars to doughnuts, now, that the rascal has taken secret possession
of Barker’s unoccupied house,” Nick said to himself. “It’s the old Barker
homestead, and sufficiently isolated to serve Deland admirably for such a
job. He knew all about it, too, and that he would ordinarily be safe from
intruders. I’ll butt in on him, now, in a way he’ll not fancy.”
The last scarce had crossed Nick’s mind when they emerged into the
cleared land back of the large old country house, stable, and outbuildings.
Ginger was still tugging on the leash and leading the way between the
buildings and toward the rear of the fine old dwelling.
Not a word now came from Henley.
Nick glanced sharply at the house while they approached it. Shutters
protected all of the lower windows. The curtains at those on the upper floors
were closely drawn. The surrounding grounds, an eighth of a mile from the
nearest road, shut in by the trees of an extensive park, were entirely deserted
and running to rank grass and weeds.
When within ten yards of the rear door, toward which the hound was
heading, Nick said abruptly:
“Stop a moment, Henley. If our man is here——”
“He’s here, Carter, all right,” Henley cut in gruffly.
He swung round while he spoke and dropped the leash, then threw his
shotgun into the hollow of his arm, instantly covering the detective.
“He’s here, Carter,” he added, with sinister significance. “Don’t you
reach for a gun. Don’t move, blast you, or I’ll pepper you so with buckshot
that you’ll look like a sieve.”
CHAPTER VIII.

FACE TO FACE.

Nick Carter’s feelings upon seeing the sudden display of animosity by
Pete Henley were not manifest in his face. He gazed at the swarthy ruffian
with hardly a change of countenance, apparently indifferent to the double-
barreled gun with which he was covered.
“What’s the joke, Henley?” he asked coolly.
The ruffian had murder in his eyes, and looked as black and threatening
as a thundercloud.
“You’re the joke, Carter, if there’s any joke to it,” he replied, with a snarl.
“You’ve barked up the wrong tree and tackled the wrong bunch. Stick up
your hands, and be quick about it.”
“Certainly, Henley, since you insist so politely,” Nick rejoined, raising his
hands as high as his head.
“Keep them there, now.”
“But you might answer my question, at least, and explain this sudden
change of attitude on your part.”
“You’ll know soon enough,” was the reply, followed by a short, sharp
whistle.
Ginger did not respond to it. He had disappeared around a corner of the
house.
Instead, the back door was quickly opened and two roughly clad men
appeared on the threshold, both still under thirty. One of them instantly
darted back through the hall, and Nick heard him shout to another in one of
the adjoining rooms.
Henley, meantime, growled harshly, with his evil eyes constantly on the
detective:
“Come out here, Foster, and get behind the dick. Feel under his coat and
get his guns. Kneel down while doing it, so I’ll not hit you. I’ll plug him, all
right, if he moves a finger.”
“There will be no occasion, Henley, you rat,” Nick now said sternly. “I
value a whole skin too highly to take any chance against that blunderbuss in
such hands as yours. I see, now, that you have served me a scurvy trick. Go
as far as you like.”
“You don’t need to tell me that,” snapped Henley. “I’m on the way. Got
’em, Bill?”
“Both of ’em, Jim,” returned Foster, who had hurriedly disarmed the
detective and was threatening him with his two weapons. “Who is he?”
“Nick Carter.”
“Thunder! Where did you run up against him? If he——”
“You’re to bring him in, Jim,” cut in the man who had briefly vanished,
and now returned to the open door. “His jags says——”
“Is he out here, Brigham?” Henley interrupted, with countenance
clearing.
“Sure. Been here ten minutes.”
“That’s more like it,” cried Henley. “He can now take the ribbons. Get a
move on, Carter, and—stop a bit!”
Nick halted.
“Feel again, Foster, and fish out his irons. Snap them on his own wrists,
hands behind him, as he will on ours if he gets a chance.”
“You’ve told the truth once, Henley, at least,” Nick put in dryly.
“But you’ll never get the chance,” Henley retorted. “Dukes down and
behind you, Carter, or I’ll pull the trigger.”
“Don’t trouble yourself,” said Nick, obeying. “Point the gun another way.
It might go off by chance.”
Henley heard the snap of handcuffs around Nick’s wrists and saw Foster
straighten up after having secured him, and he then lowered the shotgun and
grinned maliciously.
“You thought you were the real thing, didn’t you, Carter?” he demanded.
“Get a move on and I’ll show you what you’re up against and where you
stand.”
“I can guess.”
“Into the shack, and no funny business, mind you, or you’ll hear
something drop, if you live until you hit the floor. Lead the way, Brigham.
Where’s his jags?”
“In the dining room, Jim.”
“Head that way. Plug along, Carter, where he leads.”
Nick felt the prod of the ruffian’s gun in the small of his back, but he had
no intention of offering any objection. He followed Brigham into the house,
a stocky, ill-favored fellow with fiery-red hair, and in another moment he
heard the door closed and locked behind him.
The hall was dim when the sunlight was thus excluded. It ran straight
through the spacious old colonial house to the front door. A broad, but
angular stairway led up to the second floor. There was a damp and musty
smell in the long-closed dwelling, and the rooms on each side of the broad
hall looked dusty, gloomy, and deserted.
The exception, in the last respect, was the large dining room into which
the detective was conducted by the three crooks.
That room contained only one occupant, however; the man in search of
whom Chick Carter had left the Gordon residence more than an hour before
—Mr. Edgar Hereford Dayton.
He was seated in one of the leather upholstered chairs, pushed back from
the polished table. He did not appear disturbed by what had occurred or by
the advent of the detective upon the scene, though he gazed at Nick
curiously when he entered, flecking the ashes from the end of a cigarette.
His overcoat and hat were lying on a chair near the wall, and near it stood
a closed leather suit case.
Nick Carter identified him instantly as Dayton—and somewhat more than
that when he spoke.
Henley was the first to open fire, however, addressing Dayton and saying
gruffly, the moment he entered:
“You’d better clean out that town office, old sport, or fight shy from it
now on. I reckon this dick has sent his right bower to keep an eye on it.
Leastwise, I don’t see where else he would have sent him in such a rush.”
Nick suppressed a smile. It amused him to find that Henley was a bit
more discerning than he had thought him.
Dayton appeared unmoved by Henley’s announcement and advice. He
glanced at the suit case mentioned, then responded with a curious mingling
of coolness and assurance that Nick was quick to remember:
“He is welcome, Henley, to inspect that office. It already is cleaned out of
all that would interest him. Suppose, instead of giving me needless advice,
you tell me just what this meddlesome fellow is after, and what he has been
doing.”
“By Jove, I’m not mistaken,” was the thought then in Nick’s mind. “This
rascal has even more strings to his bow than I suspected.”
“That’s quickly told——” Henley began to reply.
“But better told first hand,” Nick cut in curtly, with his gaze intently fixed
on the man he addressed. “I’ll give you the information you want. I’ll tell
you what I’m after and what I’ve been doing.”
“Ah!” Dayton spoke with an icy drawl. “Better first hand, indeed, as you
say. I do not yet place you, however, nor——”
“Oh, a truce to subterfuge,” Nick again interrupted curtly.
“Subterfuge?”
“You know me perfectly well—but not better than I know you.”
“Indeed?”
“You place me, all right, as I sooner or later will again place you where
you belong,” Nick went on sternly, disregarding the other’s queries. “A wig,
a beard, a reverse curve of the eyebrows, a more florid skin, an altered voice
—it takes more than those to blind me, though you might get by others. Fly
your true colors, Mr. Mortimer Deland, and I’ll tell you what I am after and
what I’ve been doing.”
“Ah! That is a great inducement, so great that I find myself utterly unable
to resist it.”
Deland replied with unruffled composure. He drew up a little in his chair,
gazed steadily at the detective for a moment, then raised his slender white
hands to his head, deftly removing the exceedingly artistic disguise which
Nick alone had been able to penetrate, and which had fairly transfigured the
mobile, sinister, clean-cut, yet strangely effeminate features of—Mortimer
Deland.
Jim Henley and the two frowning crooks near by evinced no surprise nor
made any comments. That Deland was the master, and they merely hirelings,
was perfectly apparent to the detective.
It appeared obvious, too, that Chick Carter must have arrived too late to
have picked up the supposed Dayton before he left his office—a mischance
that would seem to have badly aggravated the present desperate situation of
the detective.
Deland appeared to think so, too, for he smiled with vicious complacency
while he tossed his disguise upon the table, saying with the same frigid voice
and insolent assurance which were so characteristic of him that they had at
once betrayed him to the detective:
“Now, having met you halfway, Carter, and complied with the stipulation
you imposed, it is up to you to perform your part of the brief verbal contract.
Sit down, if you prefer; there are plenty of chairs. I regret that I cannot
release you, but that would be injudicious for obvious reasons. Tell me, now,
as you promised, what are you after and what have you been doing, that my
good friend Henley has rounded you up in this fashion?”
CHAPTER IX.

THE ACME OF KNAVERY.

Nick Carter ignored Mortimer Deland’s mocking suavity, the miscreant’s
manifest air of superiority and contempt. He sat down directly opposite the
notorious crook, replying sternly:
“That may be quickly told, Deland, and I’m right here to tell it.”
“I am listening.”
“You wish to know what I am after. I am after a rascal who has been
playing a very extraordinary game, so extraordinary that he might have won
out and accomplished his evil designs—if I had not butted into the game to
thwart it.”
“Ah!” drawled Deland. “That makes it very unfortunate for him—but
doubly unfortunate for you, perhaps.”
“That last word is well added.”
“Indeed?”
“You will agree with me later.”
“I seldom agree with men of your vocation,” said Deland, smiling
ironically. “Be good enough to explain, Mr. Carter. I do not quite get you.
For whom are you seeking?”
“For Pauline Perrot—said to have been murdered by Arthur Gordon,”
Nick replied curtly.
“Dear me, is that so?” smiled Deland, with eyes narrowing. “I remember
Gordon. It was he who started you on my track several months ago, with
very disastrous results. I would not grieve deeply, Carter, if evil did befall
Mr. Arthur Gordon.”
“I am very well aware of that, Deland,” Nick said dryly. “Your assurance
of it is entirely unnecessary.”
“Pauline Perrot, eh?” queried Deland, unruffled. “Said to have been
murdered. She is Gordon’s stenographer, I believe. I think I have seen her
coming from his business office. Murdered, eh? What are the circumstances,
Carter? Have you succeeded in finding her—or what is left of her?”
“Yes,” Nick said shortly.
“Dear me, is that so?”
Deland did not, in fact, then suppose it was so, Henley being the only one
of the four crooks then informed of what the detective had discovered.
“I not only have found all that is left of her, but also all that she left
behind her,” Nick pointedly added.
Deland’s eyes took on a sharper gleam and glitter, his thin lips a more
sinister and threatening curve. The tinge of color in his cheeks waned
perceptibly. His long, slender fingers closed involuntarily, until their
carefully manicured nails bit into his palms. He laughed, nevertheless, in a
cold and mirthless fashion, while he echoed inquiringly:
“All that she left behind her?”
“Exactly,” said Nick.
“You mean——”
“The garments she left in the home of Mrs. Lord, with whom she has
been boarding.”
“You have been there?”
Deland’s brows knit closer and fell to a settled frown over his steadily
dilating eyes.
“How else could I have found the garments?” Nick demanded. “Yes, I
have been there and——”
“And that’s not the only place he’s been to, nor all he——”
“One moment, Henley,” Deland coldly interrupted. “I will hear you
presently. Permit Mr. Carter to have his say. What more, Carter; what
more?”
“Oh, there is a good deal more, Deland, if I chose to tell you all of my
discoveries and deductions,” Nick now said, more sternly.
“Ah, indeed?”
“So much, Deland, that it would reveal in every detail the knavish game
you have been playing,” Nick went on forcibly. “But you have overplayed
yourself, over-estimated the value of your cards.”
“My cards?”
“Have you not learned in all the years you have lived in vice and crime
that three kings, well played, will invariably beat three knaves?”
“See here, Carter——”
“Oh, you wanted me to have my say,” Nick went on sternly, interrupting.
“The three kings you have been up against, Deland, are Patsy Garvan, Chick
Carter, and myself—three kings in the detective deck. You, Deland, are
single-handed the three knaves—yourself, the man Dayton, and the
supposed murdered girl, Pauline Perrot. Three knaves, Deland, never beat
three kings.”
“You say—you say that I am Pauline Perrot?” gasped Deland, with his
wonderful nerve shaken for the first time.
“I not only say so, but I can also prove it,” cried Nick. “I say, too, that
you now have Arthur Gordon confined in this house, and that you and these
three rascals——”
“Stop!” Deland leaped to his feet. “I have heard enough from you, Carter.
Keep an eye on him, Foster, with a weapon ready. If he utters another word,
or makes an aggressive move, shoot him instantly. This way, Henley, into the
hall. I prefer to hear your story.”
An expression of devilish ferocity now had settled upon his vicious white
face. He strode into the hall, Henley following, and for several minutes the
two remained there in a whispered discussion.
Nick Carter waited with apparent indifference.
“There soon will be something doing, I imagine,” he said to himself. “I
wonder whether Chick arrived in time to pick up his quarry. That now
appears very improbable. Fortunately, however, I have another string to my
bow, one that Henley does not even suspect. The odds are considerable, but
—ah, well, I have never known him to fail to make good.”
There was a still more vicious look on Deland’s face when he returned
with Henley. It was like that which it had worn when, having caught Patsy
Garvan as he now had cornered Nick, he left him to die in the Barker tomb.
He came and stood directly in front of Nick, gazing down at him and
saying, with icy severity:
“Henley has made it perfectly plain to me. There is no occasion for you to
say more.”
“Very well,” Nick returned indifferently.
“You are very clever, Carter, very clever,” Deland went on. “I have never
in Europe encountered an inspector who compared at all with you. You are
so dangerous, Carter, that the world is too small for both of us.”
“Why don’t you move out?” Nick coolly inquired.
“You have exposed my game, indeed, and thwarted part of it,” Deland
went on, as if there had been no interruption. “But I have, at least, the money
and bonds stolen from Gordon’s vault. They are in yonder suit case.”
“Thanks for the information,” Nick again put in. “It will save me from
searching for them.”
“I also have Gordon, here, as you have inferred,” continued Deland icily.
“And, best of all—I have you!”
“I would be foolish to deny it,” Nick dryly allowed.
“And here, Carter, before we bolt for parts unknown, is where I shall get
even with you and with him, where I will forever wipe you out of my path.
Gordon is bound hand and foot in a room on the top floor.”
“Thanks again, Deland.”
“I will send you both to the devil.”
“By what route, pray?”
“In a chariot of fire!” cried Deland, with a sudden outburst of ferocity.
“Well, well, that will beat walking,” Nick declared, not in the least
daunted by the significance of the miscreant’s threat.
Deland swung around to Foster and Gribham, who had stood listening
with stoical indifference to the foregoing colloquy.
“Go and get him, you two,” he fiercely commanded. “Bring Gordon
down here. We will wipe them out together. We will leave no evidence here
to tell the story. We will bind both, lock them in the library closet, and then
fire the house.”
“That’s the stuff!” Henley said, with a growl. “It will burn like tinder.
That will finish them.”
“Get Gordon—get Gordon!” Deland fairly shrieked. “Bring him to the
library. We can be out of here with our plunder, with the deed done, in less
than a dozen minutes. Go and get Gordon. Bring Carter after me, Henley.
Bring him into the library. I’ll do it—I long to do it! It shall be my hand that
starts the flames!”
In another moment all of them, Nick Carter included, were striding into
the dimly lighted hall.
CHAPTER X.

THE OTHER STRINGS.

Patsy Garvan did not ride far with Danny Maloney after their parting
from Nick Carter and Henley. Glancing back over his shoulder, Patsy waited
only until they had rounded the curve in the road, when he called quickly:
“Slow down, Danny, and drop me. We’re out of sight.”
Danny obeyed at once, saying regretfully:
“Gee! I wish I was going with you. I might be needed.”
“One is better than two,” Patsy replied, leaping down to the road.
“There’s only half the risk of being seen. I can fill the bill, all right, single-
handed.”
“So long, then, and good luck.”
“Same to you.”
Danny sped on with the car.
Patsy Garvan, however, plunged into the woods, at once shaping a course
that would bring him in sight of the crossroad through which Nick and
Henley were to pass.
It was to enable Patsy to make this detour that Nick repeatedly stopped on
the road, pretending he wanted to find footprints left by the missing couple.
Patsy accomplished the move with no great difficulty, and entirely
unsuspected by Henley, owing to the artful attitude toward him that Nick had
assumed.
Patsy saw them pass along the road; in fact, saw them on the edge of the
pond, and then he followed them at a discreet distance until, from behind
one of the outbuildings, he saw Nick held up by Henley and afterward taken
into the house.
“Gee! that does settle it,” he said to himself. “I must know who is there
and what’s going to come off, but it won’t do for me to approach the house
from this side. Those rats are in the rear rooms, or a side one, or they could
not have reached the back door so quickly after Henley whistled. I’ll make a
circuit to the front road and have a look.”
It took Patsy several minutes to do so, seeking the shelter of a wall over
which he could plainly see the front of the dwelling, and he then met with an
agreeable surprise.
A familiar whistle fell upon his ears, and he turned and discovered Chick
under the same wall.
“Gee whiz!” he exclaimed, when they met. “This is dead lucky, for fair.”
“It’s not all luck, I guess,” Chick replied. “Give the chief the credit for
it.”
“You found your man?”
“I arrived just in time to see him leaving his office.”
“He must be out here, now, since you are here.”
“That’s what,” Chick nodded. “He went round to the back door of the
house about ten minutes ago. I’ve been waiting and watching till I could get
a line on what’s going on in there.”
“Gee! I can supply that line, all right,” chuckled Patsy.
“Cut loose, then,” said Chick.
Patsy informed him with very few words what had occurred, and the
subterfuge Nick had employed.
“It now is up to us, Chick,” he added. “The gang we want is in that house,
and probably Arthur Gordon. We must go in and get them. There’s nothing
else to it.”
“Only one thing,” corrected Chick, who again was sizing up the house.
“What’s that?”
“The way to get in, Patsy, so as to catch them hands down. It’s a hundred
to one that they are on the ground floor, also in one of the rear rooms, as you
have said.”
“It’s a safe gamble, Chick, in my opinion.”
“And I am equally sure that we could not force any of the lower windows
without being heard. We can take a chance and approach the front of the
house, and by climbing that trellis at the east end of the veranda, we can
reach the veranda roof and three of the second-floor windows.”
“Like breaking sticks,” nodded Patsy approvingly. “It’s dollars to
doughnuts that we then can quietly force one of the windows.”
“I think so, too.”