DOING DATA SCIENCE IN R
An Introduction for Social Scientists
Mark Andrews
Los Angeles
London
New Delhi
Singapore
Washington DC
Melbourne
SAGE Publications Ltd
1 Oliver’s Yard
55 City Road
London EC1Y 1SP
SAGE Publications Inc.
2455 Teller Road
Thousand Oaks, California 91320
SAGE Publications India Pvt Ltd
B 1/I 1 Mohan Cooperative Industrial Area
Mathura Road
New Delhi 110 044
SAGE Publications Asia-Pacific Pte Ltd
3 Church Street
#10-04 Samsung Hub
Singapore 049483
© Mark Andrews 2021
Apart from any fair dealing for the purposes of research or private
study, or criticism or review, as permitted under the Copyright,
Designs and Patents Act, 1988, this publication may be reproduced,
stored or transmitted in any form, or by any means, only with the
prior permission in writing of the publishers, or in the case of
reprographic reproduction, in accordance with the terms of licences
issued by the Copyright Licensing Agency. Enquiries concerning
reproduction outside those terms should be sent to the publishers.
Library of Congress Control Number: 2020945072
British Library Cataloguing in Publication data
A catalogue record for this book is available from the British Library
ISBN 978-1-5264-8676-9
ISBN 978-1-5264-8677-6 (pbk)
Editor: Aly Owen
Assistant editor: Lauren Jacobs
Production editor: Ian Antcliff
Copyeditor: QuADS Prepress Pvt Ltd
Proofreader: Neville Hankins
Marketing manager: Ben Griffin-Sherwood
Cover design: Shaun Mercier
Typeset by: C&M Digitals (P) Ltd, Chennai, India
Printed in the UK
At SAGE we take sustainability seriously. Most of our products are
printed in the UK using responsibly sourced papers and boards.
When we print overseas we ensure sustainable papers are used as
measured by the PREPS grading system. We undertake an annual
audit to monitor our sustainability.
CONTENTS
About the Author
Online Resources
1 Data Analysis and Data Science
Part I Fundamentals of Data Analysis and Data Science
2 Introduction to R
3 Data Wrangling
4 Data Visualization
5 Exploratory Data Analysis
6 Programming in R
7 Reproducible Data Analysis
Part II Statistical Modelling
8 Statistical Models and Statistical Inference
9 Normal Linear Models
10 Logistic Regression
11 Generalized Linear Models for Count Data
12 Multilevel Models
13 Nonlinear Regression
14 Structural Equation Modelling
Part III Advanced or Special Topics in Data Analysis
15 High-Performance Computing with R
16 Interactive Web Apps with Shiny
17 Probabilistic Modelling with Stan
References
Index
ABOUT THE AUTHOR
Mark Andrews (PhD)
is Senior Lecturer in the Department of Psychology at Nottingham
Trent University. There, he specializes in teaching statistics and data
science at all levels from undergraduate to PhD level. Currently, he is
the Chair of the British Psychological Society’s Mathematics,
Statistics, and Computing section. Between 2015 and 2018, Dr
Andrews was funded by the UK’s Economic and Social Research
Council (ESRC) to provide advanced training workshops on Bayesian
data analysis to UK-based researchers at PhD level and beyond in
the social sciences. Dr Andrews’ background is in computational
cognitive science, particularly focused on Bayesian models of human
cognition. He has a PhD in Cognitive Science from Cornell University,
and was a postdoctoral researcher in the Gatsby Computational
Neuroscience Unit in UCL and also in the Department of Psychology
in UCL.
ONLINE RESOURCES
Figure 1.1 The data science workflow. Raw data is usually messy and not yet
amenable to analysis of any kind. Data wrangling takes the raw data and transforms
it into a new tidy format. This data is then explored and visualized in an iterative
manner, which may also include some further wrangling. This eventually leads to
probabilistic modelling, which itself involves an iterative process of statistical
inference and model evaluation. Finally, we communicate our results in articles,
presentations, webpages, etc.
As we use the term throughout this book, data science is a set of interrelated
computational or mathematical methods and tools that are used in the general data
analysis workflow that we outline in Figure 1.1. This workflow begins with data in its
nascent and raw form. Raw data is usually impossible or extremely difficult to work
with, even casually or informally. The process of transforming the data so that it is
amenable to further analysis is data wrangling, and the resulting data sets are said to
be tidy. This data can then be explored and visualized. We view data exploration and
data visualization as ultimately accomplishing the same thing. One usually involves
quantitative descriptive analysis, while the other involves graphical analysis, but both
aim to discover potentially interesting patterns and behaviours in the data. The
exploratory analysis stage then leads us to posit a tentative probabilistic model of the
data. Put more precisely, it leads us to posit a tentative probabilistic model of the
phenomenon that generated the data. Inevitably, this model involves unknown
variables that must be inferred using statistical inference. This leads to a fitted
model, which may then be evaluated and possibly extended and modified, thus
leading to further inference. Eventually, we communicate our results in reports,
presentations, webpages, etc.
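The stages of this workflow can be sketched in a few lines of base R. This is only an illustrative sketch under stated assumptions, not code from the book: the data are made up, and the names raw_df, tidy_df and fit are hypothetical. It shows a messy raw data set being wrangled into a tidy one, explored numerically and graphically, and then used to fit a tentative normal linear model.

```r
# Illustrative sketch only: made-up data and hypothetical variable names.

# Raw data: awkward column names, numbers stored as text, a text-coded
# missing value.
raw_df <- data.frame(
  `Subject ID` = 1:6,
  `Reaction Time (ms)` = c("512", "498", "n/a", "530", "605", "475"),
  Group = c("a", "b", "a", "b", "a", "b"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Wrangling: rename columns, convert types, drop rows with unusable values.
tidy_df <- data.frame(
  subject = raw_df[["Subject ID"]],
  rt = suppressWarnings(as.numeric(raw_df[["Reaction Time (ms)"]])),
  group = factor(raw_df[["Group"]])
)
tidy_df <- tidy_df[!is.na(tidy_df$rt), ]

# Exploration: a quantitative descriptive summary per group.
aggregate(rt ~ group, data = tidy_df, FUN = mean)

# Visualization: the graphical counterpart of the summary above.
boxplot(rt ~ group, data = tidy_df)

# Modelling: posit a tentative normal linear model and infer its unknowns.
fit <- lm(rt ~ group, data = tidy_df)
summary(fit)
```

In a real analysis each stage would be revisited several times, and the wrangling and plotting would more likely use the tidyverse packages that later chapters cover.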
Each of the stages of this data science workflow involves computational and
mathematical concepts and methods. In fact, it is this combination of the
computational and the mathematical or statistical that is a defining feature or key
characteristic of data science as we conceive of it. Without using computers, and thus
performing any stages of the workflow manually in some manner, only practically
trivial types of analysis could be accomplished, and even then the analysis would be
laborious and error prone. By contrast, the more proficient we are with the relevant
computational tools, the more efficient and sophisticated our analyses can be. In this
sense, computing skills, specifically reading and writing code, are integral and vital
parts of modern data analysis. These cannot be generally sidestepped or avoided by
using graphical user interfaces (GUIs) to statistics programs. While programs like
these are sometimes suitable for novices or for casual use, they are profoundly
limited and inefficient in comparison to writing code in a high-level programming
language.
In addition to computing tools, many of the stages of the data science workflow
involve mathematical and statistical concepts and methods. This is especially true of
the statistical modelling stage, which requires a proper understanding of
mathematical and probabilistic models, and related topics such as statistical
inference. Simply being able to perform a statistical analysis computationally,
accompanied by a vague and impressionistic understanding of what the analysis is
doing and why it is doing it, will not generally be sufficient. Without a deeper and
theoretical understanding of probabilistic models, statistical inference, and related
concepts, we will not be able to make principled and informed choices concerning
which models to use for any given problem. Nor would we be able to understand the
meaning of the results of the inference, and we would be limited or mistaken in the
practical and scientific conclusions that we make when we use these models for
explanation or prediction. Moreover, statistical models of the kind that we cover in
this book should not be seen as a list of independent tools in a big toolbox, each one
designed for a different task or application, and each with its own rules and
principles. Rather, more generally, we should view statistical modelling as a
systematic framework, or even a language, for building mathematical models of
scientific phenomena using observed data. While we may talk about normal linear
models, or zero-inflated Poisson models, etc., these are just examples of the infinitely
many models that we can build to model the scientific problem at hand. Being aware
of statistical modelling as a flexible and systematic framework that is based on
pragmatic and theoretical principles allows us to more competently and confidently
perform statistical analysis, and also greatly increases the range and scope of the
analyses that are readily available to us.
1.2 WHAT IS DATA SCIENCE?
Even if we accept the nature and the value of the data analysis workflow that we’ve
just outlined, it is reasonable to ask whether it should properly be called ‘data
science’. Is this not just using a new word, even a buzzword, in place of much more
established terms like ‘statistics’ or ‘statistical data analysis’? We are using the term
‘data science’ rather than ‘statistics’ per se or some variant thereof because data
analysis as we’ve outlined it arguably goes beyond the usual focus of statistics, at
least as it is traditionally understood. Mathematical statistics as a scientific or
mathematical discipline has focused largely on the statistical modelling component of
the programme we outlined above. As we’ve hopefully made clear, this component is
of profound importance, and in fact we would argue that it is the single most
important part and even ultimate goal of data analysis. Nonetheless, in practice, data
wrangling alone occupies far more of our time and effort in any analysis, and
exploration and visualization should be seen as necessary precursors to, and even
continuous with, the statistical modelling itself. Likewise, traditional statistics often
marginalizes the practical matter of computing tools. In statistics textbooks, even
excellent ones, for example, code examples may not be provided for all analyses, and
the code may not be integrated tightly with the coverage of the statistical methods.
In this sense, traditional statistics does not thoroughly deal with all the parts of the
data analysis workflow that we have outlined. This is not a criticism of statistics, but
just a recognition of its particular focus.
This general point about real-world data analysis being more than just the traditional
focus of mathematical statistics was actually made decades ago by Tukey (1962).
There, Tukey, who was one of the most influential statisticians of the twentieth
century and a pioneer of exploratory data analysis and data visualization, preferred
the term data analysis as the general term for what he and other statistical analysts
actually do in practice. For Tukey, inferential statistics and statistical modelling were
necessary and vital, but only as a component of a much bigger and multifaceted
undertaking, which he called ‘data analysis’.
While the general spirit of the argument about the breadth and scope of data
analysis that Tukey (1962) outlined is very much in keeping with the perspective we
follow here, modern data analysis has a character that goes beyond Tukey’s vision,
however broad and comprehensive it was. This is due to the computing revolution.
For example, when Tukey was writing in the early 1960s, the world’s fastest
computers¹ were capable of around 1 million calculations per second. Approximately
60 years later in 2020, the world’s fastest computer² is capable of around 500
quadrillion (5 × 10^17) calculations per second, and a typical consumer desktop can
perform hundreds of billions of calculations per second. This revolution has
transformed all aspects of data analysis, and now computing is as vital and integral a
part of data analysis as are mathematics and statistics. It was largely the recognition
of the vital and transformative role of computing that led Cleveland (2001) to coin
the term ‘data science’. As we use the term, therefore, data science is the blend of
computational and statistical methods applied to all the aspects of data analysis.
1 https://en.wikipedia.org/wiki/Atlas_(computer)
2 https://en.wikipedia.org/wiki/Fugaku_(supercomputer)
Even if we accept that the defining feature of data science is generally the combined
application of computational tools and statistical methods to data analysis, the term
‘data science’ has some popular connotations that are somewhat at odds with the
more general understanding of the term that we are following in this book. In
particular, for some people, data science is all about concepts like predictive analytics,
big data, machine learning, deep learning, and data mining of massive unstructured data
sets, including natural language corpora. It is seen as largely a branch of computer
science and engineering, and is something that is done in the big tech companies like
Google, Amazon, Facebook, Apple, Netflix and Twitter. It is absolutely true that data
science, particularly as it is practised in industry, makes heavy use of tools like machine
learning and big data analysis software, and is applied to the analysis of massive
unstructured data sets. As real and important as these activities are, we see them
here as just one application of data science as we generally understand the term.
Also, in this particular application of data science, some topics and issues take
precedence over or dominate others. For example, in these contexts, the software
and hardware problems of being able to analyse data that is on such a large scale
are the major practical issues. Likewise, for some applications, being able to perform
successful predictions using statistical methods is the only goal, and so the assumed
statistical models on which these predictions are based are less important or even
irrelevant (see Breiman, 2001, for a well-known early discussion of these two
different general ‘cultures’ of using statistical methods).
In summary, in this book, we use the term ‘data science’ as the general term for
modern data analysis, which is something that always involves a tight integration of
computational and statistical methods and tools. In this, we are hopefully faithfully
following the broad and general understanding of what real-world data analysis
entails as described by Tukey (1962), albeit with the additional vital feature of
intensive use of computational tools. In some contexts, data science has a more
particular focus on big data, data mining, machine learning and related concepts.
That particular focus is not the focus of this book, and so this book is probably not
ideal for anyone keen to learn more about data science in just this sense of the term.
1.3 WHY R, NOT PYTHON?
We have stated repeatedly that computational methods and tools are vital for doing
data science. In this book, the computing language and environment that we use is
the R Project for Statistical Computing, simply known as R. More specifically, we use
the modern incarnation of R that is based on the so-called tidyverse. In Chapter 2 we
provide a proper introduction to R. Here, we wish to just outline why R is our choice
of language and environment, what the alternatives are, and what this entails in
terms of our conception of what data science is and how it is practised.
Given our conception of the data science workflow that we outlined in Figure 1.1, R is
an inevitable choice. We believe that R is simply the best option for performing all the
major components that we outline there. For example, for the data wrangling
component, which can be extremely laborious, R packages that are part of the
tidyverse such as readr, dplyr and tidyr, which we cover in Chapter 3, make
data wrangling fast and efficient and even pleasurable. For data visualization, the
ggplot2 package provides us with essentially a high-level and expressive language
for data visualization. For the statistical modelling loop, which we cover in all the
chapters of Part II of this book, R provides a huge treasure trove of packages for
virtually every conceivable type of statistical method and model. Also, R is the
dominant environment, using packages like rstan and brms, for doing Bayesian
probabilistic modelling using the Stan probabilistic programming language. We cover
Bayesian models throughout the chapters of Part II. For communication, R provides
us with the ability to produce reproducible data analysis reports using RMarkdown,
knitr and other tools, which we describe in Chapter 7.
Everything we cover in this book could be done using another programming
language, or possibly using some set of different languages. Chief among these
alternatives is Python. Python is close to being the most widely used general-purpose
programming language of any kind. It has been very popular for almost two decades,
and its dominance and popularity have been increasing in recent years. Moreover,
one of Python’s major domains of application is data science, with some arguing that
it should be preferred over R for data science generally. For the big data, big tech, data
mining, machine learning sense of data science that we mentioned above, Python
certainly ought to be the dominant choice over R. This is for multiple reasons. First,
Python is now the principal computing language for doing machine learning, deep
learning and related activities. Also, because Python is a general-purpose
programming language and one that is widely used on the back end of web
applications, this makes integrating data science tools with the ‘production’ web
server software much easier and scalable. Likewise, Python is a very powerful and
well-designed general-purpose programming language, which means that it is easier
to write complex highly structured software applications in Python than in a
specialized language like R. This again facilitates the integration of Python data
science tools with production- or enterprise-level software applications. Nonetheless,
for the more general conception of data science that we are following in this book,
Python is more limited than R. For example, for data wrangling of typical rectangular
data structures, Python’s pandas package, as excellent as it is, is not as high level
and expressive as R’s tidyverse-based packages like dplyr and tidyr. This entails
that wrangling data into shape in R can be easier and involve less lower-level
procedural and imperative code than when using Python. Likewise for data
visualization, Python’s matplotlib package is very powerful but is also lower-level
than ggplot2. This entails that relatively complex visualization requires considerably
more procedural or imperative code, which is harder and slower to read and write
than using the more expressive high-level code of ggplot2. The higher-level
counterpart of matplotlib is seaborn, which is excellent, but seaborn is less
powerful and extensive in terms of its features than ggplot2. For statistical
modelling, at the moment, there frankly is no competition between R and Python.
Even though Python has excellent statistics packages like statsmodels, these
provide only a fraction of the statistical models and methods that are available from R
packages. Finally, although dynamic notebooks like Jupyter³ are widely used by
Python users, and are excellent too, it is not as easy to create reproducible reports,
for example for publication in scientific journals, using Jupyter as it is using
RMarkdown and knitr. In fact, currently the easiest way to write a Python-based
reproducible manuscript is to use Python within R using the reticulate package.
3 https://jupyter.org/
1.4 WHO IS THIS BOOK FOR?
As mentioned at the start of this chapter, the prototypical audience at whom this
book is aimed are those engaged in data analysis in scientific research, specifically
research at or beyond PhD level. In scientific research, statistics obviously plays a
vital role, and specifically this is based on using data to build and interpret statistical
or probabilistic models of the scientific phenomenon being studied. This book is
heavily focused on this particular kind of statistical data analysis. As we’ve
mentioned, in data science as it is practised in industry and business, often the other
‘culture’ of statistics (see Breiman, 2001), namely predictive analytics and algorithms,
is the dominant one, and so this book is not ideal for those whose primary data
science interests are of that kind.
We’ve explicitly stated that this book is intended for those doing research in the
social sciences, but this also requires some explanation. The explicit targeting of the
social sciences is largely just to keep some focus and limits to the sets of examples
that are used throughout the book. However, beyond the example data sets that are
used, there is little about this content that is of relevance to only those doing
research in social science disciplines. All the content on data wrangling, exploration
and visualization, statistical modelling, etc., is hopefully just as relevant to someone
doing research in some field of biology as it is to someone doing research in the
social sciences. The nature of the data in terms of its complexity, and the nature of
the analysis of this data using complex statistical models, are traditionally very similar
in biology and social sciences. In fact, the statistics practised in all of these fields has
all arisen from a common original source, particularly the early twentieth-century
pioneering work of R. A. Fisher (e.g. Fisher, 1925).
We assume that the readers of this book will already be familiar with statistics to an
extent. For example, we assume that they’ve taken undergraduate-level courses
introducing statistics as it is used and applied in some discipline of science. We will
present the statistical methods that we cover from a foundational perspective, and so
do not assume that readers are already confident and familiar with the fundamental
principles of statistical inference and modelling. However, we do assume that they
will have already had an introduction to statistics so that concepts like the ‘normal
distribution’, ‘linear regression’ and ‘confidence intervals’ will be relatively familiar,
even if they don’t have a very precise grasp of their technical meaning. On the other
hand, we do not assume any familiarity with any computing methods, nor R in
particular. In fact, we assume that many readers will be brand new to R.
1.5 THE STYLE AND STRUCTURE OF THIS BOOK
Apart from this brief introductory chapter, the remainder of the book is a blend of
expository text, R code, mathematical equations, diagrams and R-based plots. It is
intended that people will read this book while using R to execute all the code
examples and so produce all the results that are presented either as R output or as
figures. Of course, if readers wish to read first and then run the code later, perhaps
on a second reading, that is entirely a matter of preference. However, all the code
that we present throughout this book is ready to run, and does not require anything
other than the R packages that are explicitly mentioned in the code and the relevant
data sets, which are all available on the website that accompanies the book.
The book is divided into three parts. Part I is all about the parts of the data science
workflow shown in Figure 1.1 except for the statistical modelling loop part. Thus, in
Part I we provide a comprehensive general introduction to R, a chapter on data
wrangling using dplyr, tidyr, etc., a chapter on data visualization, and another on
data exploration. We then go into more detail about programming in R, and conclude
Part I with a chapter on doing reproducible data analysis using tools like RMarkdown
and Git. Part II of the book, which is the largest part, is all about the statistical
modelling loop part of the data science workflow. There, we provide a general
introduction to statistical inference, and then cover all the major types of regression
models, specifically normal linear regression, generalized linear models, multilevel
models, nonlinear regression, and path analysis and related models. In Part III,
which is the shortest part, we cover some specialized topics that are not necessarily
part of the statistical modelling topics, but not general or introductory either.
Specifically, in Part III, we provide an introduction to using R for high-performance
computing, making interactive graphics web apps using Shiny, and a general
introduction to Bayesian probabilistic programming using Stan.
PART I FUNDAMENTALS OF DATA ANALYSIS AND DATA SCIENCE
PART I CONTENTS
Chapter 2: Introduction to R
Chapter 3: Data Wrangling
Chapter 4: Data Visualization
Chapter 5: Exploratory Data Analysis
Chapter 6: Programming in R
Chapter 7: Reproducible Data Analysis
2 INTRODUCTION TO R
What is R, and why should we use it?
A power tool for data analysis
Open source software
Popularity
Installing R and RStudio
Installing R
Installing RStudio Desktop
Guided Tour of RStudio Desktop
RStudio menus
First steps in R
Step 0: Using the R console
Step 1: Using R as a calculator
Step 2: Variables and assignment
Step 3: Vectors
Step 4: Data frames
Step 5: Other data structures
Step 6: Functions
Step 7: Scripts
Step 8: Installing and loading packages
Step 9: Reading in and viewing data
Step 10: Working directory, RStudio projects, and clean workspaces
2.1 WHAT IS R, AND WHY SHOULD WE USE IT?
While there are many ways of defining what R is, for most practical purposes, it is
sufficient to describe R simply as a program for doing statistics and data analysis. If
you’ve done any kind of statistics or data analyses, the chances are extremely high
that you’ve used some computer program to do so. The range of such programs is
large. They include SPSS, SAS, Stata, Minitab, Python, Matlab, Maple, Mathematica,
Tableau, Excel, SQL, and many others. These do not all do the same thing, and so
are not necessarily interchangeable. Some, like Python, are general-purpose
programming languages that have become widely used for data science. Others, like
SQL, are database languages. SPSS is primarily a GUI program for statistics, originally
targeted at researchers in the social sciences. R can be seen as just another program
in this large and heterogeneous list. The advantages of R, however, which set it apart
from many other programs, boil down to three interrelated factors: it is immensely
powerful, it is open source, and it is very (and increasingly) widely used. Let us now
consider each of these three points further.
A power tool for data analysis
The range and depth of statistical analyses and general data analyses that can be
accomplished with R are immense:
Built into R’s standard set of packages is virtually the entire repertoire of widely
known and used statistical methods. These include general and generalized
linear regression analyses (which themselves include analyses of variance, t-tests
and correlations), descriptive statistical methods and nonparametric methods.
Also built into R is an extensive graphics library (see the graphics package,
which is usually termed the ‘base R’ plotting package) for doing virtually the
entire repertoire of statistical plots and graphics, and these graphics tools can be
combined programmatically to lead to any desired plot or visualization.
In addition to its built-in tools, R has a vast set of add-on or contributed
packages. There are presently over 16,000 additional contributed packages (to
be precise, there are 16,105 packages as of 12 August 2020). While they differ
in size, each one will usually provide at least dozens of additional tools and
methods for statistics, data manipulation and processing, or graphics. Some of
these packages could be described as almost mini-languages in themselves. For
example, and as we’ll see below, the package ggplot2 is effectively a mini-
language for data visualization, while packages like dplyr and tidyr are
effectively mini-languages for data wrangling and manipulation. In addition,
because R is the de facto standard computing platform for the discipline of
statistics, almost every new or existing statistical technique developed by
statisticians is made available as a package in R. With all of these packages, we
are hard pressed to find anything at all related to statistics and data analysis,
including data graphics and visualization, that is not currently available in R.
As large as the set of R packages is, the capabilities of R do not stop here. R is a
high-level and expressive programming language that is specialized to efficiently
manipulate and perform calculations or analyses on data. This entails that R can
be used programmatically to greatly increase the speed and efficiency of any
data analysis. More importantly, R can be extended by writing custom programs
and functions, which may then be packaged and distributed for others to use.
While writing large or complex extension packages would require some
programming skill and experience, programming in R on a smaller and simpler
scale is in fact relatively easy, and basics can be mastered quickly. Given that R is
a programming language, there is then effectively no real limit on its capabilities.
The R programming language itself can be extended by interfacing with other
programming languages like C, C++, Fortran and Python. In particular, the
popular Rcpp package greatly simplifies integrating R with C++, thus allowing
fast and efficient C++ code to be used seamlessly within R. Likewise, R can be
easily interfaced with high-performance computing or big data tools like Hadoop,
Spark, SQL, parallel computing libraries, cluster computing, and so on.
Taken together, these points entail that R is an extremely powerful and extensible
environment for doing any kind of statistical computing or data analysis.
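Two of the points above can be checked using nothing but R's built-in packages: that t-tests are special cases of linear regression, and that writing a small custom function is straightforward. The following is a minimal sketch on simulated data, not code from the book; the function name standardize and the variable names g, y, tt, fit, t_lm and z are hypothetical.

```r
# Illustrative sketch only: simulated data and hypothetical names.
set.seed(42)
g <- factor(rep(c("control", "treatment"), each = 20))
y <- rnorm(40, mean = ifelse(g == "treatment", 1, 0))

# A classical two-sample t-test, assuming equal variances.
tt <- t.test(y ~ g, var.equal = TRUE)

# The same analysis expressed as a linear model with one binary predictor.
fit <- lm(y ~ g)
t_lm <- summary(fit)$coefficients["gtreatment", "t value"]

# The two t statistics agree in magnitude. Their signs differ because
# t.test() takes control minus treatment, while the lm() coefficient is
# treatment minus control.
all.equal(abs(unname(tt$statistic)), abs(t_lm))

# Extending R with a small custom function: rescale a numeric vector to
# have mean 0 and standard deviation 1.
standardize <- function(x) (x - mean(x)) / sd(x)
z <- standardize(y)
c(mean = mean(z), sd = sd(z))
```

The equal-variance assumption matters here: the default Welch version of t.test() does not correspond exactly to this linear model.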
Open source software
R is free and open source software, distributed according to the GNU General Public License (GPL).
Likewise, virtually all of the 16,000 or so contributed R packages are free and open
source software, with over 99% of them being distributed in accordance with one of
the major open source licences, such as GNU, MIT, BSD, Apache, Creative Commons
or Artistic. It is important to emphasize the distinctions in practice and in principle
between free and open source software, on the one hand, and freeware, on the
other. Freeware is proprietary software that is distributed, usually only in binary form
and with certain restrictions and conditions, at no monetary cost to the user. While it
can be used in a limited sense at no cost, it cannot be extended or developed, its
source code cannot be viewed, and its availability at no cost can be revoked at any
time. Free and open source software, on the other hand, is licensed so that anyone
can use it and develop it in any manner, including and especially by viewing and
extending its source code. Free and open source software is defined by four essential
freedoms:1
1 https://www.gnu.org/philosophy/free-sw.html
The freedom to run the program in any manner and for any purpose
The freedom to study and modify the source code
The freedom to distribute copies of the original code
The freedom to distribute modified versions of the code.
In practical terms, the most obvious consequence of R’s free and open source nature
is that it is freely available for everyone to use, on more or less any device they
choose. It is most widely used on Windows, Macs, and Linux, but because it is
available in open source it can in principle be compiled for any platform, and can be
used on Android, iOS, Chrome OS, and many others. This means that anyone can use
R at any time anywhere and always at no cost. And because of its licence, this will
always be the case.
Open source software always has the potential to ‘go viral’ and develop a large self-
sustaining community of user/developers. This is precisely what has happened in the
case of R. Users are drawn in initially because it is available at no cost, can be used
on any platform, and has a large number of built-in or add-on tools. Because R is an
open platform, developers such as academic statisticians or data scientists who want
to reach a large audience write further add-on packages and make them publicly
available. This draws in more users. The users themselves may write blogs, books,
articles, or teach with R, thus attracting still more users, and so on.
Popularity
The Journal of Statistical Software2 is the most widely used academic journal
describing advances and developments in software for statistics. While it accepts
articles describing methods implemented in a wide variety of languages, it is
overwhelmingly dominated by programs written in R. This fact illustrates that when it
comes to the computational implementation of modern statistical methods, R is the
de facto standard.
2 https://www.jstatsoft.org
In an extensive analysis of general data science software (Muenchen, 2019), R is
ranked as one of the five most popular data science programs in jobs for data
scientists. In multiple surveys of data scientists, it is often ranked as the first or
second most widely used data science tool, and it is among the most widely ‘followed’
topics on Quora and LinkedIn. Likewise, despite being a domain-specific language,
R is currently very highly ranked according to many rankings of widely used
programming and scripting languages of any kind. For
example, the latest RedMonk ratings3 place R at rank 13; the latest TIOBE ratings4
place R at rank 8; and the latest PYPL ratings5 place R at rank 7.
3 https://redmonk.com/sogrady/2020/07/27/language-rankings-6-20/
4 https://www.tiobe.com/tiobe-index
5 http://pypl.github.io/PYPL.html
Figure 2.1 The typical layout of the RStudio Desktop when it is first opened
We will see three main windows, which we will describe in more detail below. For
now, we see that to the left, occupying about half the screen by default, is a window
with the console pane. To the right, there are two other windows arranged vertically.
Usually (in fact almost always) we also have two windows on the left. On top of the
window with the console pane, we usually have a script editor. If it is not present, it
can be brought up by the key command Ctrl+Shift+N (Cmd+Shift+N on Macs), or by
going to the File menu at the top of the screen and choosing New File > R Script.
Doing so will create a blank and untitled R script in the script editor window, which
will now occupy the upper left quadrant of the screen. Our screen should now look
like Figure 2.2.
Figure 2.2 RStudio Desktop with R script editor window in upper left quadrant
We’ll now look in more detail at each of these four windows.
Console window: The console window usually occupies about half of the left-
hand side of the screen. Like all windows, it can be resized with the mouse, or
with the window resize buttons to the upper right of each window. In the
console window, by default, there are tabs for three panes: the Console, the
Terminal, and Jobs. The tab for the console is usually the active one. This is
where we type R commands, followed by Enter, and get all our results and
output. It is the single most important part of the RStudio Desktop. We will use it
extensively, beginning with our introduction to R commands in the next section.
The next tab is the Terminal, and this is the command line interface to our
computer’s operating system. On Windows, this is usually the Command Prompt
(cmd.exe). On Macs and Linux, it is a Unix shell, such as bash. The terminal is
used far less than the console. It may in fact never be used, and is
only necessary when we need a command line interface to our operating system.
The Jobs pane is also not as widely used. It is for running scripts
asynchronously.
Script editor window: The script editor is where we write scripts of R
commands. We use scripts whenever we want to save our R commands for later
reuse, or whenever the R commands are becoming relatively long and complex.
We write here just as we would write in any text file editor (e.g. Notepad), and
we can save these files on our computer’s file system as normal. As we’ll see
below, we can execute or run the commands we write in any script either line by
line or region by region or by executing the whole script at once. If we run
individual lines or regions, the R code effectively gets copied to the console
followed by Enter just as if we copied and pasted from the editor to the console.
We can run the whole script at once, as we’ll see below, by using the source
command, for which there is a button on the upper right of the editor window.
Environment, History, etc., window: In the upper right pane, there are tabs
for the Environment, History, and Connections panes. Sometimes there are other
tabs, such as for Build and Git. Of these, Environment and History are the more
commonly used. As we’ll see as soon as we start using R commands and
creating variables, the details of the variables and data structures that are in our
current R session are listed in the Environment. We will have the option of
clearing or deleting any or all of these whenever we wish. Likewise, we may save
all of these objects to file, and reload them later. The Environment window also
provides us with a convenient means of importing data files. The History window
provides a list of all the R commands that we have typed. This is a particularly
useful feature, as we will see. It allows us to review everything we have typed in
our R session, and allows us to extract and rerun any commands we want. We
may also save our command history to file at any time. The other tabs available
in this window are not usually used as often. Connections is used for connecting
with databases or clusters, and we will see it again later in the book. Build,
which may not be listed at all, is for building packages or compiling code. Git,
which likewise may not exist, is for version control of our code using Git. We will
talk about using Git for version control in Chapter 6.
Files, Plots, Packages, Help, Viewer window: The lower right window
provides tabs for browsing files, viewing plots, managing R packages, reading
help files, and for viewing html documents. The Files window is a regular file
browser, where we can view, create, delete, etc., files and directories. The Plots
window is where all figures created during our R session are shown. If we
produce many figures, they are placed in a stack and we can move forwards and
backwards between them. The Packages window, which we will return to in
more detail in the next section, lists all our installed R packages. From here, we
can also activate packages for our current R session, as well as install new
packages. The Help window displays help pages for any R command or package.
We can browse through these pages, but a particularly useful feature is how we
can jump straight to a needed help page for a command or package directly
from the R command line or script editor. We will return to this feature in one of
the next sections. The Viewer window is where we can view html pages that are
created in RStudio. These could include Shiny web apps or the html pages
produced by RMarkdown documents. These are topics to which we will return in
later chapters.
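Several of the actions described for these windows have console equivalents. The following is a minimal sketch, assuming a plain R session; the file paths are temporary files created only for this example, and `dplyr` is used purely as an example package name.

```r
# 1. Script editor: write a tiny script to a temporary file and run it all at once.
script_path <- tempfile(fileext = ".R")
writeLines(c("x <- 1:10", "total <- sum(x)"), script_path)
source(script_path)        # roughly what running the whole script does
total

# 2. Environment: list, save, remove and reload objects.
ls()
rdata_path <- tempfile(fileext = ".RData")
save(x, total, file = rdata_path)
rm(x, total)
load(rdata_path)           # x and total are available again

# 3. Packages and Help.
installed <- rownames(installed.packages())
head(installed)
library(stats)             # stats ships with R, so this always works
# install.packages("dplyr")   # example package name; needs internet access
# help("mean")                # or equivalently: ?mean
```

The commented-out lines are left inactive because they depend on an internet connection or open an interactive help page.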
RStudio menus
At the top of the RStudio Desktop there is the following set of menus.
File: The File menu is primarily for the opening, closing and saving of files. Often,
these files will be R scripts that open in the script editor. But they could also be R
data files, RMarkdown documents, Shiny apps, etc. Here, we can also open and
close RStudio projects, which is a very useful organizing feature to which we will
return below.
Edit: The Edit menu primarily provides tools for standard file editing operations
such as copy, cut, paste, search and replace, and undo and redo. It also provides
code folding features, which are very useful for reducing clutter when editing
relatively large R scripts.
Code: The Code menu provides many useful tools for making editing and running
code considerably easier and more efficient. We will explore these features in
more depth in subsequent sections, but they include adding and removing code
comments, reformatting code, jumping to functions within and between scripts,
and creating code regions that can then be run independently.
View: The View menu primarily provides options to move around RStudio quickly.
These options are all bound to key combinations, as are many other RStudio
features, and learning these key combinations is certainly worthwhile because of
the eventual speed and efficiency gains that they provide.
Plots: The Plots menu primarily provides features that are also available in the
Plots window itself.
Session: The Session menu allows us to start new separate RStudio sessions.
These then run independently of one another. Also in the Session menu, we can
restart the R session in the background, which is a useful feature. Remember
that RStudio itself is just a front to an R session that runs in the background.
Sometimes it is a good idea to restart the R session so as to start in a clean and
fresh state. This can be done through the Session menu Restart R option, which
is also bound to the key combination Ctrl+Shift+F10. Also in Session are options
to set R’s working directory. The concept of a working directory is a simple but
important one, and we will cover it below.
Build: The Build menu provides features for running scripts for software builds.
This is particularly used for creating R packages.
Debug: The Debug menu provides tools for debugging our R code. Debugging
usually only becomes a necessity when doing R programming per se, and is not
something that is usually required when writing individual commands or scripts
of commands.
Profile: The Profile menu provides tools for profiling the running and efficiency
of our R code. Code efficiency is certainly not something that those new to R
need to worry about, but when writing relatively complex code, profiling can
identify bottlenecks.
Tools: The Tools menu provides miscellaneous tools such as for working with
version control using Git, accessing the computer operating system’s command
line interface, installing and updating packages (as could also be done in the
Packages window), and viewing and modifying keyboard shortcuts. Here, we can
also access the Global Options and Project Options. Global Options is where all
the general R and RStudio settings are set. One immediately useful setting here
is the Appearance setting, which can allow us to change the font, font size, and
colour theme of RStudio to suit our preference. Project Options are for the
RStudio project-specific settings. We will return to these below.
Help: The Help menu provides much the same information as can be found in
the Help window. It also provides some additional links to online resources, such
as RStudio’s cheat sheets,6 which are excellent concise guides to many different
R and RStudio topics. Also available in the Help menu are tools to access RStudio
internal diagnostics. This is only needed if RStudio seems to be malfunctioning.
6 https://www.rstudio.com/resources/cheatsheets/
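Two of the menu features above, setting the working directory (Session) and checking code efficiency (Profile), can be approximated from the console. The sketch below uses base R's `getwd()`, `setwd()` and `system.time()`; it is a simple stand-in for illustration, not a reproduction of what the menus do internally, and the function names `slow_sum_sq` and `fast_sum_sq` are hypothetical.

```r
# Working directory: query it, change it, and restore it.
old_wd <- getwd()
setwd(tempdir())                     # a temporary directory, used only for this example
writeLines("hello", "example.txt")   # relative path, resolved against the working directory
file.exists(file.path(tempdir(), "example.txt"))
setwd(old_wd)

# Timing code: a deliberately slow loop that grows a vector element by element...
slow_sum_sq <- function(n) {
  out <- c()
  for (i in seq_len(n)) out <- c(out, i^2)
  sum(out)
}

# ...and a vectorized equivalent.
fast_sum_sq <- function(n) sum(seq_len(n)^2)

system.time(slow_sum_sq(10000))   # elapsed time for the loop version
system.time(fast_sum_sq(10000))   # elapsed time for the vectorized version
```

Comparing the two timings gives a crude picture of where a bottleneck lies; a full profiler breaks the time down further, by individual calls.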
It was, indeed, a clever bit of detective work that had enabled Nick Carter
to form a theory consistent with all of the circumstances and the
accumulation of evidence denoting that Arthur Gordon was guilty of the
basest of treachery and the most heinous of crimes, and which would have
been convincing not only to the public, but probably to all other detectives
than Nick Carter himself.
He keenly realized, however, that a theory based only upon his own
convictions was not enough, that absolute evidence was needed to convince
others, and he was not long in hitting upon a plan by which he thought he
could obtain it.
Nick hurriedly explained it to Patsy, giving him a few necessary
instructions, and he then sent him to call the suspected man from the kitchen.
Henley came slouching into the library a moment later, with Ginger
trailing at his heels. He had a more lowering look in his shifty eyes. He had
become impatient and suspicious during his long wait. He did not fancy his
having been excluded from the conference of the detectives. It smacked of
distrust of him, and his resentment was manifest in his swarthy face.
Nick saw it, of course, and at once took steps to dispel it.
“Pardon me, Henley, for keeping you waiting so long,” he apologized
with a heartiness well calculated to be convincing. “I had no idea it would
take more than a few minutes to examine these articles. Sorry to have kept
you waiting.”
“That’s all right, Mr. Carter,” growled Henley, with countenance lighting.
“Time ain’t wuth much to me. I reckoned you’d want a good look at them.”
“I have examined them carefully, Henley.”
“What d’ye think about it?”
“It looks like a bad mess, very bad,” Nick said, more gravely.
“So it does,” Henley nodded. “There ain’t nothing to it but murder, that I
can see.”
“I’m inclined to agree with you,” Nick replied.
“Sure thing, chief,” put in Patsy. “What else can you make of it? It’s dead
lucky we met Mr. Henley. He sure has put us on the right track.”
“And he can do still more to aid us,” supplemented Nick approvingly. “I
suppose, Henley, you are perfectly willing to assist us. You will be well paid
for your services. I guarantee that.”
“Your word’s good enough for me, Mr. Carter,” said Henley, consenting
with a readiness denoting that his misgivings were entirely dispelled. “I’m
right here to lend you a hand. Say what you want, sir, and I’ll do it.”
“Good enough,” Nick declared. “We’ll set about it at once. Find the
butler, Patsy, and have him give you a pair of Gordon’s shoes. I will look
after those left by the girl. We’ll leave these other articles until we return. I’ll
take the precaution, however, to lock the library door. Get Gordon’s shoes
and rejoin us in the car.”
Patsy hastened from the room, then started upstairs to say a few
encouraging words to Strickland and Wilhelmina.
“I wish to visit the spot where you found these garments, Henley, or
where Ginger nosed them out, to be more correct,” said Nick, taking only the
pair of button boots from the table and thrusting them into his pocket.
“I’ll show you,” said Henley. “That won’t take long.”
“We will expedite matters by going in my car as far as possible,” Nick
added. “Bring along the dog. We may find him useful.”
“He’s some dog, Mr. Carter; you can bet on that.”
“He looks it, Henley, no mistake. One moment while I lock this door and
remove the key. Now, then, we’re off.”
Nick led the way out to the touring car, in which Patsy presently joined
them, bringing a pair of Gordon’s shoes, and in another moment they were
speeding down the long driveway toward the woodland road.
“Take us to the point where we picked Henley up, Danny,” Nick directed.
“He then can take the ribbons and show us the way.”
“You can run a quarter mile farther,” said Henley. “That’ll take us to the
crossroad. It’s rough going, then, too rough for a buzz car.”
“We will walk the remaining distance, Henley, in that case,” Nick replied,
all the while with an air of friendliness and appreciation of his services that
appeared to deceive the swarthy ruffian. “I think you said it is less than a
mile from the road to the pond you mentioned.”
“ ’Tain’t more than half a mile.”
“Just where did you see Gordon and the girl last evening?”
“Going through the crossroad.”
“We traced them to the juncture of the two roads.”
“It was a quarter mile from there that I saw them.”
“Was Gordon carrying a suit case?”
“That’s what,” nodded Henley. “The girl had her jacket over her arm. The
man had an ugly look, and they seemed to be in a fuss over something, but I
couldn’t hear what they said. I watched them till they turned a bend in the
road, and that was the last I saw of them.”
“Gordon looked threatening, did he?”
“I sure would have thought so, Mr. Carter, if he had been looking at me,”
Henley forcibly declared. “He looked fit to fight a dog.”
If Nick Carter had wanted further evidence of Henley’s complicity in the
knavish game that was being played, these last statements would have
convinced him of it, in view of his own discoveries and deductions. He did
not betray his suspicions, but pretended to have entire confidence in the
rascal, interrogating him along much the same lines until Danny brought the
car to a stop at the crossroad.
Nick was the first to alight, followed by Henley and the hound, while
Patsy paused to question:
“Am I to go with you, chief?”
Nick hesitated for a moment, as if he had given this matter no previous
thought, and he then said abruptly:
“No, you’ll not be needed. Henley and I can look over the ground and
accomplish all that can be done.”
“Sure we can,” put in Henley, with ill-concealed eagerness.
“You return with Danny, Patsy, and keep an eye on those things in the
library. There is a bare possibility that some one will try to destroy them, in
case our suspicions are known.”
“That’s right, too,” Patsy quickly agreed. “I thought you were taking a
chance, chief, in leaving them there.”
“You return and look after them,” Nick repeated decidedly. “I’ll hoof it
back with Henley after making an investigation. He won’t mind the tramp.”
“Mind it be hanged!” cried Henley. “Tramping round these diggings is
the most that I do.”
“That settles it, then,” said Nick. “Back into the crossroad to make a turn,
Danny, and wait for us at Gordon’s place.”
“I’ve got you, chief,” nodded Patsy. “We’ll keep an eye on things.”
Nick did not hasten his departure with Henley. He waited until Danny had
turned the touring car, then watched it speed away with both of his
assistants, till it vanished around a near bend in the road.
Henley stood silently watching him, with his shotgun under his arm.
There was a gleam of secret satisfaction deep down in his shifty eyes, an
ominous curve in his thin-lipped mouth. Both vanished instantly, however,
when Nick turned and said:
“Now, Henley, it’s up to you.”
“I’ll make good, all right,” was the reply, with a covert significance the
detective was quick to notice.
“Lead the way, then.”
“I’ll soon show you, Mr. Carter,” Henley added, with the same sinister
significance. “Come on, Ginger. He’s some dog, Carter, some dog. Ginger
can’t be beat.”
Nick did not reply. He followed the swarthy ruffian over the rough
crossroad, stopping at intervals to study the ground, stating that he wanted to
examine the footprints of the missing couple, if any could be found. He
delayed frequently in this way—but with an entirely different object in view.
Twenty minutes brought them to a path through the woodland, into which
Henley struck without hesitation, remarking grimly:
“They must have gone this way. It was on this side of the pond that
Ginger nosed out the bloodstained togs.”
“How far is the pond from here?” Nick inquired, following him.
“Not far,” Henley gruffly assured him. “It’s over the hill and down into
the valley. There’s another path on t’other side of it, leading to a road
running south.”
“Toward Fordham, then.”
“That’s what. Gordon must have known about the pond. ’Tain’t very big,
but it’s as deep as a volcano. The devil himself couldn’t raise a corpse sunk
to the bottom of it. Gordon knew that, mebbe.”
“Quite likely, Henley, since he evidently wanted to get rid of the girl,”
Nick allowed.
“That’s how it looks to me. Bear off this way, sir.”
Henley strode away to the left and plunged through the bushes and
underbrush, Nick following, with Ginger bringing up in the rear.
Ten minutes brought them in sight of the pond, shut in on all sides by a
thick belt of woods, and Nick followed his uncouth guide down to the edge
of it and to the spot he was seeking, a lonely and suitable place enough for
such a crime as superficially appeared to have been committed.
“Here’s the spot,” cried Henley, pointing to some trampled shrubs and
underbrush. “There’s the log where Ginger nosed out the girl’s hat and
jacket. They were rolled up and thrust under it, then partly covered with dirt
and leaves.”
“Yes, yes, I see.”
“Here’s blood on the bushes, and footprints in the ground and dry leaves,
as if the girl put up a fight to save herself from——”
“Stop a moment,” said Nick, intently viewing the evidence mentioned. “I
want to compare these shoes with the imprints.”
“Gordon’s shoes?”
“Yes. The button boots belong to the girl. She left them in a house where
she has been boarding.”
“You went there after them?” questioned Henley, with sinister scrutiny.
“Yes, certainly,” said Nick, without looking up. “By Jove, they
correspond perfectly, Henley. There’s no question about it.”
Nick was comparing both pieces of footwear with several impressions
found in the damp earth. There was, as he had stated, no question as to the
correspondence in size and shape, which was further evidence of who had
been there the previous evening.
“It looks bad, bad enough,” he added, after viewing the blood-spattered
bushes, the rough ground on all sides, and seeking vainly for evidence
showing in which direction Gordon had departed.
“You have made no search for the girl’s body, Henley, you said.”
“What’s the use?” Henley asked, with a growl. “A hundred to one it’s at
the bottom of the pond.”
“Very likely,” admitted Nick, with seeming uncertainty as to what course
to take.
“Gordon wouldn’t have waited to bury it.”
“True again,” Nick allowed. “If we only knew in which direction he went
——”
“We can find that out easy enough,” Henley interrupted, with eyes
gleaming for an instant.
“How so?” asked Nick, though he had expected and been only waiting for
these suggestions. “How can we contrive to trace him?”
“Leave it to Ginger.”
“You mean——”
“Ginger will show you,” Henley cut in. “He can trail him like breaking
sticks. He’s some dog, Mr. Carter, some dog. Wait a bit and I’ll show you.
Gimme one of Gordon’s shoes.”
“By Jove, that’s a good idea, Henley,” Nick cried, as if he had not thought
of it. “He can get the scent from this, perhaps, as you suggest. I ought to
have been wise to that.”
“Here you, Ginger, come here,” Henley growled harshly. “Come here,
you rascal.”
The hound bounded through the bushes and cringed at his master’s feet.
Henley seized him by the scruff of the neck and held to his nostrils the
shoe the detective had given him, then pointed to the larger of the imprints in
the ground.
“Get after him, Ginger!” he commanded, producing a leather strap and
hooking it to the dog’s collar. “Follow him up! After him, Ginger, you
rascal!”
The hound brightened up and appeared to know what was wanted. He
began to bark, until Henley cuffed him fiercely, and then he thrust his
muzzle to the ground, whining and eagerly tugging hard on the leather leash.
Henley seized his shotgun from the ground where he had placed it, crying
gruffly:
“I told you, Carter. He’s got the scent. Come on at my heels. Ginger’ll
trail him.”
“By Jove, I believe you are right, Henley,” Nick cried, following.
“I know I’m right. He’s some dog, sir, some dog.”
“Some dog, Henley, no mistake.”
“Can you stick close?”
“Bet you!” said Nick, as both plunged on after the hound. “You can’t go
too fast for me.”
“Sing out if I do.”
“I’ll hang on, all right. Want me to carry your gun?”
“Not much!” growled Henley. “I’m used to this ’ere business.”
“Gordon evidently went round the pond, instead of back to the
crossroad.”
“That’s so. He most likely was heading for the other road.”
“It looks so, for fair.”
“Ginger’ll trail him. Leave it to Ginger.”
The hound was plunging on all the while, with his muzzle to the ground,
and was shaping a course through the woods and around the south side of the
pond.
“Plainly enough, whoever planted this evidence wore the shoes Gordon
had been wearing,” thought Nick, tramping rapidly on behind Henley.
“That’s evidence enough, too, that he now is in the hands of this rascal’s
confederates. It would be like Mortimer Deland not to overlook a point as
essential as that. Where will the trail end? That’s the question.”
It then was, in fact, almost the only important question in Nick Carter’s
mind. He felt that he had a correct answer for all of the others. He was not
left long in uncertainty, however, for the trail was not a very long one.
Ten minutes brought them to a narrow road on the south side of the pond,
though a quarter mile from it, and the hound started off to the left without a
moment’s hesitation.
Another eighth of a mile brought them to what evidently was an
extensive private estate. There were low walls through the woods, and away
off to the right could be seen at intervals, when the trees and foliage did not
hide them, the white stones and monuments of a distant cemetery.
“Whose place is this, Henley?” Nick inquired, while both scrambled over
a low wall over which the hound had leaped. “Do you know who owns this
estate?”
“Sure I know,” growled Henley, over his shoulder. “I know every place in
these parts.”
“Whose is it?”
“It’s owned by a man named Barker, Colonel Morgan Barker, but he’s in
Europe with his family. The house hasn’t been open for a year.”
Nick remembered the man and the place, also the Barker tomb, in which
Mortimer Deland had temporarily concealed the art treasures stolen from
Rudolph Strickland’s flat in Fifth Avenue, and from which gruesome
confinement Nick had rescued Patsy Garvan on the night of the round-up.
No additional evidence was needed to convince him that he had hit the
nail on the head, that Pauline Perrot and Mortimer Deland were one and the
same, and that this notorious European crook was back of the knavery then
in progress.
“It’s dollars to doughnuts, now, that the rascal has taken secret possession
of Barker’s unoccupied house,” Nick said to himself. “It’s the old Barker
homestead, and sufficiently isolated to serve Deland admirably for such a
job. He knew all about it, too, and that he would ordinarily be safe from
intruders. I’ll butt in on him, now, in a way he’ll not fancy.”
The last scarce had crossed Nick’s mind when they emerged into the
cleared land back of the large old country house, stable, and outbuildings.
Ginger was still tugging on the leash and leading the way between the
buildings and toward the rear of the fine old dwelling.
Not a word now came from Henley.
Nick glanced sharply at the house while they approached it. Shutters
protected all of the lower windows. The curtains at those on the upper floors
were closely drawn. The surrounding grounds, an eighth of a mile from the
nearest road, shut in by the trees of an extensive park, were entirely deserted
and running to rank grass and weeds.
When within ten yards of the rear door, toward which the hound was
heading, Nick said abruptly:
“Stop a moment, Henley. If our man is here——”
“He’s here, Carter, all right,” Henley cut in gruffly.
He swung round while he spoke and dropped the leash, then threw his
shotgun into the hollow of his arm, instantly covering the detective.
“He’s here, Carter,” he added, with sinister significance. “Don’t you
reach for a gun. Don’t move, blast you, or I’ll pepper you so with buckshot
that you’ll look like a sieve.”
CHAPTER VIII.
FACE TO FACE.
Patsy Garvan did not ride far with Danny Maloney after their parting
from Nick Carter and Henley. Glancing back over his shoulder, Patsy waited
only until they had rounded the curve in the road, when he called quickly:
“Slow down, Danny, and drop me. We’re out of sight.”
Danny obeyed at once, saying regretfully:
“Gee! I wish I was going with you. I might be needed.”
“One is better than two,” Patsy replied, leaping down to the road.
“There’s only half the risk of being seen. I can fill the bill, all right, single-
handed.”
“So long, then, and good luck.”
“Same to you.”
Danny sped on with the car.
Patsy Garvan, however, plunged into the woods, at once shaping a course
that would bring him in sight of the crossroad through which Nick and
Henley were to pass.
It was to enable Patsy to make this detour that Nick repeatedly stopped on
the road, pretending he wanted to find footprints left by the missing couple.
Patsy accomplished the move with no great difficulty, and entirely
unsuspected by Henley, owing to the artful attitude toward him that Nick had
assumed.
Patsy saw them pass along the road; in fact, saw them on the edge of the
pond, and then he followed them at a discreet distance until, from behind
one of the outbuildings, he saw Nick held up by Henley and afterward taken
into the house.
“Gee! that does settle it,” he said to himself. “I must know who is there
and what’s going to come off, but it won’t do for me to approach the house
from this side. Those rats are in the rear rooms, or a side one, or they could
not have reached the back door so quickly after Henley whistled. I’ll make a
circuit to the front road and have a look.”
It took Patsy several minutes to do so, seeking the shelter of a wall over
which he could plainly see the front of the dwelling, and he then met with an
agreeable surprise.
A familiar whistle fell upon his ears, and he turned and discovered Chick
under the same wall.
“Gee whiz!” he exclaimed, when they met. “This is dead lucky, for fair.”
“It’s not all luck, I guess,” Chick replied. “Give the chief the credit for
it.”
“You found your man?”
“I arrived just in time to see him leaving his office.”
“He must be out here, now, since you are here.”
“That’s what,” Chick nodded. “He went round to the back door of the
house about ten minutes ago. I’ve been waiting and watching till I could get
a line on what’s going on in there.”
“Gee! I can supply that line, all right,” chuckled Patsy.
“Cut loose, then,” said Chick.
Patsy informed him with very few words what had occurred, and the
subterfuge Nick had employed.
“It now is up to us, Chick,” he added. “The gang we want is in that house,
and probably Arthur Gordon. We must go in and get them. There’s nothing
else to it.”
“Only one thing,” corrected Chick, who again was sizing up the house.
“What’s that?”
“The way to get in, Patsy, so as to catch them hands down. It’s a hundred
to one that they are on the ground floor, also in one of the rear rooms, as you
have said.”
“It’s a safe gamble, Chick, in my opinion.”
“And I am equally sure that we could not force any of the lower windows
without being heard. We can take a chance and approach the front of the
house, and by climbing that trellis at the east end of the veranda, we can
reach the veranda roof and three of the second-floor windows.”
“Like breaking sticks,” nodded Patsy approvingly. “It’s dollars to
doughnuts that we then can quietly force one of the windows.”
“I think so, too.”