R Programming for Data Science
Roger D. Peng
© 2014 - 2016 Roger D. Peng
Also By Roger D. Peng
The Art of Data Science
Exploratory Data Analysis with R
Report Writing for Data Science in R
Contents

1. Stay in Touch!

2. Preface

3. History and Overview of R
3.1 What is R?
3.2 What is S?
3.3 The S Philosophy
3.4 Back to R
3.5 Basic Features of R
3.6 Free Software
3.7 Design of the R System
3.8 Limitations of R
3.9 R Resources

4. Getting Started with R
4.1 Installation
4.2 Getting started with the R interface

5. R Nuts and Bolts
5.1 Entering Input
5.2 Evaluation
5.3 R Objects
5.4 Numbers
5.5 Attributes
5.6 Creating Vectors
5.7 Mixing Objects
5.8 Explicit Coercion
5.9 Matrices
5.10 Lists
5.11 Factors
5.12 Missing Values
5.13 Data Frames
5.14 Names
5.15 Summary

6. Getting Data In and Out of R
6.1 Reading and Writing Data
6.2 Reading Data Files with read.table()
6.3 Reading in Larger Datasets with read.table
6.4 Calculating Memory Requirements for R Objects

7. Using the readr Package

8. Using Textual and Binary Formats for Storing Data
8.1 Using dput() and dump()
8.2 Binary Formats

9. Interfaces to the Outside World
9.1 File Connections
9.2 Reading Lines of a Text File
9.3 Reading From a URL Connection

10. Subsetting R Objects
10.1 Subsetting a Vector
10.2 Subsetting a Matrix
10.3 Subsetting Lists
10.4 Subsetting Nested Elements of a List
10.5 Extracting Multiple Elements of a List
10.6 Partial Matching
10.7 Removing NA Values

11. Vectorized Operations
11.1 Vectorized Matrix Operations

12. Dates and Times
12.1 Dates in R
12.2 Times in R
12.3 Operations on Dates and Times
12.4 Summary

13. Managing Data Frames with the dplyr package
13.1 Data Frames
13.2 The dplyr Package
13.3 dplyr Grammar
13.4 Installing the dplyr package
13.5 select()
13.6 filter()
13.7 arrange()
13.8 rename()
13.9 mutate()
13.10 group_by()
13.11 %>%
13.12 Summary

14. Control Structures
14.1 if-else
14.2 for Loops
14.3 Nested for loops
14.4 while Loops
14.5 repeat Loops
14.6 next, break
14.7 Summary

15. Functions
15.1 Functions in R
15.2 Your First Function
15.3 Argument Matching
15.4 Lazy Evaluation
15.5 The ... Argument
15.6 Arguments Coming After the ... Argument
15.7 Summary

16. Scoping Rules of R
16.1 A Diversion on Binding Values to Symbol
16.2 Scoping Rules
16.3 Lexical Scoping: Why Does It Matter?
16.4 Lexical vs. Dynamic Scoping
16.5 Application: Optimization
16.6 Plotting the Likelihood
16.7 Summary

17. Coding Standards for R

18. Loop Functions
18.1 Looping on the Command Line
18.2 lapply()
18.3 sapply()
18.4 split()
18.5 Splitting a Data Frame
18.6 tapply
18.7 apply()
18.8 Col/Row Sums and Means
18.9 Other Ways to Apply
18.10 mapply()
18.11 Vectorizing a Function
18.12 Summary

19. Regular Expressions
19.1 Before You Begin
19.2 Primary R Functions
19.3 grep()
19.4 grepl()
19.5 regexpr()
19.6 sub() and gsub()
19.7 regexec()
19.8 Summary

20. Debugging
20.1 Something’s Wrong!
20.2 Figuring Out What’s Wrong
20.3 Debugging Tools in R
20.4 Using traceback()
20.5 Using debug()
20.6 Using recover()
20.7 Summary

21. Profiling R Code
21.1 Using system.time()
21.2 Timing Longer Expressions
21.3 The R Profiler
21.4 Using summaryRprof()
21.5 Summary

22. Simulation
22.1 Generating Random Numbers
22.2 Setting the random number seed
22.3 Simulating a Linear Model
22.4 Random Sampling
22.5 Summary

23. Data Analysis Case Study: Changes in Fine Particle Air Pollution in the U.S.
23.1 Synopsis
23.2 Loading and Processing the Raw Data
23.3 Results

24. About the Author

1. Stay in Touch!
Thanks for purchasing this book. If you are interested in hearing more from me about things that
I’m working on (books, data science courses, podcast, etc.), you can do two things.
First, I encourage you to join my mailing list of Leanpub Readers¹. On this list I send out updates
of my own activities as well as occasional comments on data science current events. I’ll also let you
know what my co-conspirators Jeff Leek and Brian Caffo are up to because sometimes they do really
cool stuff.
Second, I have a regular podcast called Not So Standard Deviations² that I co-host with Dr. Hilary
Parker, a Senior Data Analyst at Etsy. On this podcast, Hilary and I talk about the craft of data
science and discuss common issues and problems in analyzing data. We’ll also compare how data
science is approached in both academia and industry contexts and discuss the latest industry trends.
You can listen to recent episodes on our SoundCloud page or you can subscribe to it in iTunes³ or
your favorite podcasting app.
Thanks again for purchasing this book and please do stay in touch!
¹http://eepurl.com/bAJ3zj
²https://soundcloud.com/nssd-podcast
³https://itunes.apple.com/us/podcast/not-so-standard-deviations/id1040614570

2. Preface
I started using R in 1998 when I was a college undergraduate working on my senior thesis.
The version was 0.63. I was an applied mathematics major with a statistics concentration and
I was working with Dr. Nicolas Hengartner on an analysis of word frequencies in classic texts
(Shakespeare, Milton, etc.). The idea was to see if we could identify the authorship of each of the
texts based on how frequently they used certain words. We downloaded the data from Project
Gutenberg and used some basic linear discriminant analysis for the modeling. The work was
eventually published¹ and was my first ever peer-reviewed publication. I guess you could argue
it was my first real “data science” experience.
Back then, no one was using R. Most of my classes were taught with Minitab, SPSS, Stata, or
Microsoft Excel. The cool people on the cutting edge of statistical methodology used S-PLUS. I
was working on my thesis late one night and I had a problem. I didn’t have a copy of any of those
software packages because they were expensive and I was a student. I didn’t feel like trekking over
to the computer lab to use the software because it was late at night.
But I had the Internet! After a couple of Yahoo! searches I found a web page for something called R,
which I figured was just a play on the name of the S-PLUS package. From what I could tell, R was a
“clone” of S-PLUS that was free. I had already written some S-PLUS code for my thesis so I figured
I would try to download R and see if I could just run the S-PLUS code.
It didn’t work. At least not at first. It turns out that R is not exactly a clone of S-PLUS and quite a few
modifications needed to be made before the code would run in R. In particular, R was missing a lot of
statistical functionality that had existed in S-PLUS for a long time already. Luckily, R’s programming
language was pretty much there and I was able to more or less re-implement the features that were
missing in R.
After college, I enrolled in a PhD program in statistics at the University of California, Los Angeles.
At the time the department was brand new and they didn’t have a lot of policies or rules (or classes,
for that matter!). So you could kind of do what you wanted, which was good for some students and
not so good for others. The Chair of the department, Jan de Leeuw, was a big fan of XLisp-Stat and
so all of the department’s classes were taught using XLisp-Stat. I diligently bought my copy of Luke
Tierney’s book² and learned to really love XLisp-Stat. It had a number of features that R didn’t have
at all, most notably dynamic graphics.
But ultimately, there were only so many parentheses that I could type, and still all of the research-
level statistics was being done in S-PLUS. The department didn’t really have a lot of copies of S-PLUS
lying around so I turned back to R. When I looked around at my fellow students, I realized that I
was basically the only one who had any experience using R. Since there was a budding interest in R
around the department, I decided to start a “brown bag” series where every week for about an hour
I would talk about something you could do in R (which wasn’t much, really). People seemed to like
it, if only because there wasn’t really anyone to turn to if you wanted to learn about R.
¹http://amstat.tandfonline.com/doi/abs/10.1198/000313002100#.VQGiSELpagE
²http://www.amazon.com/LISP-STAT-Object-Oriented-Environment-Statistical-Probability/dp/0471509167/
By the time I left grad school in 2003, the department had essentially switched over from XLisp-
Stat to R for all its work (although there were a few hold outs). Jan discusses the rationale for the
transition in a paper³ in the Journal of Statistical Software.
In the next step of my career, I went to the Department of Biostatistics⁴ at the Johns Hopkins
Bloomberg School of Public Health, where I have been for the past 12 years. When I got to Johns
Hopkins people already seemed into R. Most people had abandoned S-PLUS a while ago and were
committed to using R for their research. Of all the available statistical packages, R had the most
powerful and expressive programming language, which was perfect for someone developing new
statistical methods.
However, we didn’t really have a class that taught students how to use R. This was a problem because
most of our grad students were coming into the program having never heard of R. Most likely in
their undergraduate programs, they used some other software package. So along with Rafael Irizarry,
Brian Caffo, Ingo Ruczinski, and Karl Broman, I started a new class to teach our graduate students
R and a number of other skills they’d need in grad school.
The class was basically a weekly seminar where one of us talked about a computing topic of interest.
I gave some of the R lectures in that class and when I asked people who had heard of R before, almost
no one raised their hand. And no one had actually used it before. The main selling point at the time
was “It’s just like S-PLUS but it’s free!” A lot of people had experience with SAS or Stata or SPSS. A
number of people had used something like Java or C/C++ before and so I often used that as a reference
frame. No one had ever used a functional-style of programming language like Scheme or Lisp.
To this day, I still teach the class, known as Biostatistics 140.776 (“Statistical Computing”). However,
the nature of the class has changed quite a bit over the past 10 years. The population of students
(mostly first-year graduate students) has shifted to the point where many of them have been
introduced to R as undergraduates. This trend mirrors the overall trend with statistics where we
are seeing more and more students do undergraduate majors in statistics (as opposed to, say,
mathematics). Eventually, by 2008–2009, when I’d asked how many people had heard of or used
R before, everyone raised their hand. However, even at that late date, I still felt the need to convince
people that R was a “real” language that could be used for real tasks.
R has grown a lot in recent years, and is being used in so many places now, that I think it’s
essentially impossible for a person to keep track of everything that is going on. That’s fine, but
it makes “introducing” people to R an interesting experience. Nowadays in class, students are often
teaching me something new about R that I’ve never seen or heard of before (they are quite good
at Googling around for themselves). I feel no need to “bring people over” to R. In fact it’s quite the
opposite–people might start asking questions if I weren’t teaching R.
³http://www.jstatsoft.org/v13/i07
⁴http://www.biostat.jhsph.edu

This book comes from my experience teaching R in a variety of settings and through different stages
of its (and my) development. Much of the material has been taken from my Statistical Computing
class as well as the R Programming⁵ class I teach through Coursera.
I’m looking forward to teaching R to people as long as people will let me, and I’m interested in
seeing how the next generation of students will approach it (and how my approach to them will
change). Overall, it’s been just an amazing experience to see the widespread adoption of R over the
past decade. I’m sure the next decade will be just as amazing.
⁵https://www.coursera.org/course/rprog
3. History and Overview of R
There are only two kinds of languages: the ones people complain about and the ones
nobody uses —Bjarne Stroustrup

Watch a video of this chapter¹

3.1 What is R?
This is an easy question to answer. R is a dialect of S.

3.2 What is S?
S is a language that was developed by John Chambers and others at the old Bell Telephone
Laboratories, originally part of AT&T Corp. S was initiated in 1976² as an internal statistical analysis
environment—originally implemented as Fortran libraries. Early versions of the language did not
even contain functions for statistical modeling.
In 1988 the system was rewritten in C and began to resemble the system that we have today (this
was Version 3 of the language). The book Statistical Models in S by Chambers and Hastie (the white
book) documents the statistical analysis functionality. Version 4 of the S language was released in
1998 and is the version we use today. The book Programming with Data by John Chambers (the
green book) documents this version of the language.
Since the early 90’s the life of the S language has gone down a rather winding path. In 1993 Bell Labs
gave StatSci (later Insightful Corp.) an exclusive license to develop and sell the S language. In 2004
Insightful purchased the S language from Lucent for $2 million. In 2006, Alcatel purchased Lucent
Technologies, and the merged company is now called Alcatel-Lucent.
Insightful sold its implementation of the S language under the product name S-PLUS and built a
number of fancy features (GUIs, mostly) on top of it—hence the “PLUS”. In 2008 Insightful was
acquired by TIBCO for $25 million. As of this writing TIBCO is the current owner of the S language
and is its exclusive developer.
The fundamentals of the S language itself have not changed dramatically since the publication of the
Green Book by John Chambers in 1998. In 1998, S won the Association for Computing Machinery’s
Software System Award, a highly prestigious award in the computer science field.
¹https://youtu.be/STihTnVSZnI
²http://cm.bell-labs.com/stat/doc/94.11.ps


3.3 The S Philosophy


The general S philosophy is important to understand for users of S and R because it sets the stage for
the design of the language itself, which many programming veterans find a bit odd and confusing.
In particular, it’s important to realize that the S language had its roots in data analysis, and did not
come from a traditional programming language background. Its inventors were focused on figuring
out how to make data analysis easier, first for themselves, and then eventually for others.
In Stages in the Evolution of S³, John Chambers writes:

“[W]e wanted users to be able to begin in an interactive environment, where they
did not consciously think of themselves as programming. Then as their needs became
clearer and their sophistication increased, they should be able to slide gradually into
programming, when the language and system aspects would become more important.”

The key part here was the transition from user to developer. They wanted to build a language that
could easily service both “people”. More technically, they needed to build a language that would
be suitable for interactive data analysis (more command-line based) as well as for writing longer
programs (more traditional programming language-like).
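
The transition Chambers describes can be sketched in R itself. The snippet below is only an illustration, with made-up data and a hypothetical function name (`summarize_sample`): the same calculation is first typed interactively, one line at a time, and then wrapped in a function once it needs to be reused.

```r
# Step 1: interactive exploration, typed line by line at the R prompt.
x <- c(2.1, 3.4, 1.9, 5.6, 4.2)   # hypothetical measurements
mean(x)                           # 3.44
sd(x)

# Step 2: the same steps collected into a small reusable function.
# summarize_sample is a hypothetical name chosen for this sketch.
summarize_sample <- function(values) {
        c(mean = mean(values), sd = sd(values))
}

summarize_sample(x)
```

Nothing about the language changes between the two steps; the user simply moves from issuing commands to defining their own.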

3.4 Back to R
The R language came into use quite a bit after S had been developed. One key limitation of the S
language was that it was only available in a commercial package, S-PLUS. In 1991, R was created
by Ross Ihaka and Robert Gentleman in the Department of Statistics at the University of Auckland. In
1993 the first announcement of R was made to the public. Ross’s and Robert’s experience developing
R is documented in a 1996 paper in the Journal of Computational and Graphical Statistics:

Ross Ihaka and Robert Gentleman. R: A language for data analysis and graphics. Journal
of Computational and Graphical Statistics, 5(3):299–314, 1996

In 1995, Martin Mächler made an important contribution by convincing Ross and Robert to use the
GNU General Public License⁴ to make R free software. This was critical because it allowed for the
source code for the entire R system to be accessible to anyone who wanted to tinker with it (more
on free software later).
In 1996, public mailing lists were created (the R-help and R-devel lists) and in 1997 the R Core
Group was formed, containing some people associated with S and S-PLUS. Currently, the core group
controls the source code for R and is solely able to check in changes to the main R source tree. Finally,
in 2000 R version 1.0.0 was released to the public.
³http://www.stat.bell-labs.com/S/history.html
⁴http://www.gnu.org/licenses/gpl-2.0.html
History and Overview of R 7

3.5 Basic Features of R


In the early days, a key feature of R was that its syntax was very similar to that of S, making it easy for
S-PLUS users to switch over. While R’s syntax is nearly identical to that of S, R’s semantics,
while superficially similar to S, are quite different. In fact, R is technically much closer to the Scheme
language than it is to the original S language when it comes to how R works under the hood.
Today R runs on almost any standard computing platform and operating system. Its open source
nature means that anyone is free to adapt the software to whatever platform they choose. Indeed, R
has been reported to be running on modern tablets, phones, PDAs, and game consoles.
One nice feature that R shares with many popular open source projects is frequent releases. These
days there is a major annual release, typically in October, where major new features are incorporated
and released to the public. Throughout the year, smaller-scale bugfix releases will be made as needed.
The frequent releases and regular release cycle indicate active development of the software and
ensure that bugs will be addressed in a timely manner. Of course, while the core developers control
the primary source tree for R, many people around the world make contributions in the form of new
features, bug fixes, or both.
Another key advantage that R has over many other statistical packages (even today) is its sophisticated
graphics capabilities. R’s ability to create “publication quality” graphics has existed since the
very beginning and has generally been better than competing packages. Today, with many more
visualization packages available than before, that trend continues. R’s base graphics system allows
for very fine control over essentially every aspect of a plot or graph. Other newer graphics systems,
like lattice and ggplot2, allow for complex and sophisticated visualizations of high-dimensional data.
R has maintained the original S philosophy, which is that it provides a language that is both useful
for interactive work and powerful enough for developing new tools. This
allows the user, who takes existing tools and applies them to data, to slowly but surely become a
developer who is creating new tools.
Finally, one of the joys of using R has nothing to do with the language itself, but rather with the
active and vibrant user community. In many ways, a language is successful inasmuch as it creates a
platform with which many people can create new things. R is that platform and thousands of people
around the world have come together to make contributions to R, to develop packages, and help
each other use R for all kinds of applications. The R-help and R-devel mailing lists have been highly
active for over a decade now and there is considerable activity on web sites like Stack Overflow.

3.6 Free Software


A major advantage that R has over many other statistical packages is that it’s free in the sense
of free software (it’s also free in the sense of free beer). The copyright for the primary source code
for R is held by the R Foundation⁵ and is published under the GNU General Public License version
2.0⁶.
⁵http://www.r-project.org/foundation/
According to the Free Software Foundation, with free software, you are granted the following four
freedoms⁷:

• The freedom to run the program, for any purpose (freedom 0).
• The freedom to study how the program works, and adapt it to your needs (freedom 1). Access
to the source code is a precondition for this.
• The freedom to redistribute copies so you can help your neighbor (freedom 2).
• The freedom to improve the program, and release your improvements to the public, so that
the whole community benefits (freedom 3). Access to the source code is a precondition for
this.

You can visit the Free Software Foundation’s web site⁸ to learn a lot more about free software. The
Free Software Foundation was founded by Richard Stallman in 1985 and Stallman’s personal web
site⁹ is an interesting read if you happen to have some spare time.

3.7 Design of the R System


The primary R system is available from the Comprehensive R Archive Network¹⁰, also known as
CRAN. CRAN also hosts many add-on packages that can be used to extend the functionality of R.
The R system is divided into 2 conceptual parts:

1. The “base” R system that you download from CRAN: Linux¹¹, Windows¹², Mac¹³, Source Code¹⁴
2. Everything else.

R functionality is divided into a number of packages.

• The “base” R system contains, among other things, the base package which is required to run
R and contains the most fundamental functions.
• The other packages contained in the “base” system include utils, stats, datasets, graphics,
grDevices, grid, methods, tools, parallel, compiler, splines, tcltk, stats4.

⁶http://www.gnu.org/licenses/gpl-2.0.html
⁷http://www.gnu.org/philosophy/free-sw.html
⁸http://www.fsf.org
⁹https://stallman.org
¹⁰http://cran.r-project.org
¹¹http://cran.r-project.org/bin/linux/
¹²http://cran.r-project.org/bin/windows/
¹³http://cran.r-project.org/bin/macosx/
¹⁴http://cran.r-project.org/src/base/R-3/R-3.1.3.tar.gz

• There are also “Recommended” packages: boot, class, cluster, codetools, foreign, KernSmooth,
lattice, mgcv, nlme, rpart, survival, MASS, spatial, nnet, Matrix.

When you download a fresh installation of R from CRAN, you get all of the above, which represents
a substantial amount of functionality. However, there are many other packages available:

• There are over 4000 packages on CRAN that have been developed by users and programmers
around the world.
• There are also many packages associated with the Bioconductor project¹⁵.
• People often make packages available on their personal websites; there is no reliable way to
keep track of how many packages are available in this fashion.
• There are a number of packages being developed on repositories like GitHub and BitBucket
but there is no reliable listing of all these packages.

3.8 Limitations of R
No programming language or statistical analysis system is perfect. R certainly has a number of
drawbacks. For starters, R is essentially based on almost 50-year-old technology, going back to the
original S system developed at Bell Labs. There was originally little built-in support for dynamic or
3-D graphics (but things have improved greatly since the “old days”).
Another commonly cited limitation of R is that objects must generally be stored in physical memory.
This is in part due to the scoping rules of the language, but R generally is more of a memory hog
than other statistical packages. However, there have been a number of advancements to deal with
this, both in the R core and also in a number of packages developed by contributors. Also, computing
power and capacity have continued to grow over time and the amount of physical memory that can be
installed on even a consumer-level laptop is substantial. While we will likely never have enough
physical memory on a computer to handle the increasingly large datasets that are being generated,
the situation has gotten quite a bit easier over time.
At a higher level one “limitation” of R is that its functionality is based on consumer demand and
(voluntary) user contributions. If no one feels like implementing your favorite method, then it’s your
job to implement it (or you need to pay someone to do it). The capabilities of the R system generally
reflect the interests of the R user community. As the community has ballooned in size over the past
10 years, the capabilities have similarly increased. When I first started using R, there was very little
in the way of functionality for the physical sciences (physics, astronomy, etc.). However, now some
of those communities have adopted R and we are seeing more code being written for those kinds of
applications.
If you want to know my general views on the usefulness of R, you can see them here in the following
exchange on the R-help mailing list with Douglas Bates and Brian Ripley in June 2004:
¹⁵http://bioconductor.org

Roger D. Peng: I don’t think anyone actually believes that R is designed to make
everyone happy. For me, R does about 99% of the things I need to do, but sadly, when I
need to order a pizza, I still have to pick up the telephone.

Douglas Bates: There are several chains of pizzerias in the U.S. that provide for Internet-
based ordering (e.g. www.papajohnsonline.com) so, with the Internet modules in R, it’s
only a matter of time before you will have a pizza-ordering function available.

Brian D. Ripley: Indeed, the GraphApp toolkit (used for the RGui interface under R for
Windows, but Guido forgot to include it) provides one (for use in Sydney, Australia, we
presume as that is where the GraphApp author hails from). Alternatively, a Padovian
has no need of ordering pizzas with both home and neighbourhood restaurants ….

At this point in time, I think it would be fairly straightforward to build a pizza ordering R package
using something like the RCurl or httr packages. Any takers?

3.9 R Resources

Official Manuals
As far as getting started with R by reading stuff, there is of course this book. Also, available from
CRAN¹⁶ are

• An Introduction to R¹⁷
• R Data Import/Export¹⁸
• Writing R Extensions¹⁹: Discusses how to write and organize R packages
• R Installation and Administration²⁰: This is mostly for building R from the source code
• R Internals²¹: This manual describes the low level structure of R and is primarily for developers
and R core members
• R Language Definition²²: This documents the R language and, again, is primarily for developers
¹⁶http://cran.r-project.org
¹⁷http://cran.r-project.org/doc/manuals/r-release/R-intro.html
¹⁸http://cran.r-project.org/doc/manuals/r-release/R-data.html
¹⁹http://cran.r-project.org/doc/manuals/r-release/R-exts.html
²⁰http://cran.r-project.org/doc/manuals/r-release/R-admin.html
²¹http://cran.r-project.org/doc/manuals/r-release/R-ints.html
²²http://cran.r-project.org/doc/manuals/r-release/R-lang.html

Useful Standard Texts on S and R


• Chambers (2008). Software for Data Analysis, Springer
• Chambers (1998). Programming with Data, Springer: This book is not about R, but it describes
the organization and philosophy of the current version of the S language, and is a useful
reference.
• Venables & Ripley (2002). Modern Applied Statistics with S, Springer: This is a standard
textbook in statistics and describes how to use many statistical methods in R. This book has
an associated R package (the MASS package) that comes with every installation of R.
• Venables & Ripley (2000). S Programming, Springer: This book is a little old but is still relevant
and accurate. Despite its title, this book is useful for R also.
• Murrell (2005). R Graphics, Chapman & Hall/CRC Press: Paul Murrell wrote and designed
much of the graphics system in R and this book essentially documents the underlying details.
This is not so much a “user-level” book as a developer-level book. But it is an important book
for anyone interested in designing new types of graphics or visualizations.
• Wickham (2014). Advanced R, Chapman & Hall/CRC Press: This book by Hadley Wickham
covers a number of areas including object-oriented programming, functional programming,
profiling and other advanced topics.

Other Resources
• Major technical publishers like Springer, Chapman & Hall/CRC have entire series of books
dedicated to using R in various applications. For example, Springer has a series of books called
Use R!.
• A longer list of books can be found on the CRAN web site²³.

²³http://www.r-project.org/doc/bib/R-books.html
4. Getting Started with R
4.1 Installation
The first thing you need to do to get started with R is to install it on your computer. R works on
pretty much every platform available, including the widely available Windows, Mac OS X, and Linux
systems. If you want to watch a step-by-step tutorial on how to install R for Mac or Windows, you
can watch these videos:

• Installing R on Windows¹
• Installing R on the Mac²

There is also an integrated development environment available for R that is built by RStudio. I really
like this IDE—it has a nice editor with syntax highlighting, there is an R object viewer, and there are
a number of other nice features that are integrated. You can see how to install RStudio here:

• Installing RStudio³

The RStudio IDE is available from RStudio’s web site⁴.

4.2 Getting started with the R interface


After you install R you will need to launch it and start writing R code. Before we get to exactly how
to write R code, it’s useful to get a sense of how the system is organized. In these two videos I talk
about where to write code and how to set your working directory, which lets R know where to find
all of your files.

• Writing code and setting your working directory on the Mac⁵
• Writing code and setting your working directory on Windows⁶

¹http://youtu.be/Ohnk9hcxf9M
²https://youtu.be/uxuuWXU-7UQ
³https://youtu.be/bM7Sfz-LADM
⁴http://rstudio.com
⁵https://youtu.be/8xT3hmJQskU
⁶https://youtu.be/XBcvH1BpIBo

5. R Nuts and Bolts
5.1 Entering Input
At the R prompt we type expressions. The <- symbol is the assignment operator.

> x <- 1
> print(x)
[1] 1
> x
[1] 1
> msg <- "hello"

The grammar of the language determines whether an expression is complete or not.

x <- ## Incomplete expression

The # character indicates a comment. Anything to the right of the # (including the # itself) is ignored.
This is the only comment character in R. Unlike some other languages, R does not support multi-line
comments or comment blocks.

5.2 Evaluation
When a complete expression is entered at the prompt, it is evaluated and the result of the evaluated
expression is returned. The result may be auto-printed.

> x <- 5 ## nothing printed
> x ## auto-printing occurs
[1] 5
> print(x) ## explicit printing
[1] 5

The [1] shown in the output indicates that x is a vector and 5 is its first element.
Typically with interactive work, we do not explicitly print objects with the print function; it is much
easier to just auto-print them by typing the name of the object and hitting return/enter. However,
when writing scripts, functions, or longer programs, there is sometimes a need to explicitly print
objects because auto-printing does not work in those settings.
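To make the difference concrete, here is a small sketch (the function name show_x is made up for this illustration). Inside a function, typing the name of an object on its own line evaluates it but prints nothing; only the explicit print() call produces output.

```r
## show_x is a hypothetical function name, used only for this illustration
show_x <- function() {
    x <- 5
    x           ## evaluated, but NOT auto-printed inside a function
    print(x)    ## explicit printing always works
    invisible(NULL)
}

## capture.output() collects whatever gets printed while show_x() runs
out <- capture.output(show_x())
out
```

If auto-printing worked inside functions, out would contain two copies of the printed value; here it should contain only the one produced by print().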
When an R vector is printed you will notice that an index for the vector is printed in square brackets
[] on the side. For example, see this integer sequence of length 21.

> x <- 10:30
> x
 [1] 10 11 12 13 14 15 16 17 18 19 20 21
[13] 22 23 24 25 26 27 28 29 30

The numbers in the square brackets are not part of the vector itself; they are merely part of the
printed output.
With R, it’s important that one understand that there is a difference between the actual R object
and the manner in which that R object is printed to the console. Often, the printed output may have
additional bells and whistles to make the output more friendly to the users. However, these bells and
whistles are not inherently part of the object.
Note that the : operator is used to create integer sequences.

5.3 R Objects
R has five basic or “atomic” classes of objects:

• character
• numeric (real numbers)
• integer
• complex
• logical (TRUE/FALSE)

The most basic type of R object is a vector. Empty vectors can be created with the vector() function.
There is really only one rule about vectors in R, which is that a vector can only contain objects
of the same class.
But of course, like any good rule, there is an exception, which is a list, which we will get to a bit later.
A list is represented as a vector but can contain objects of different classes. Indeed, that’s usually
why we use them.
There is also a class for “raw” objects, but they are not commonly used directly in data analysis and
I won’t cover them here.

5.4 Numbers
Numbers in R are generally treated as numeric objects (i.e. double precision real numbers). This
means that even if you see a number like “1” or “2” in R, which you might think of as integers, they
are likely represented behind the scenes as numeric objects (so something like “1.00” or “2.00”). This
isn’t important most of the time…except when it is.

If you explicitly want an integer, you need to specify the L suffix. So entering 1 in R gives you a
numeric object; entering 1L explicitly gives you an integer object.
There is also a special number Inf which represents infinity. This allows us to represent entities like
1 / 0. This way, Inf can be used in ordinary calculations; e.g. 1 / Inf is 0.

The value NaN represents an undefined value (“not a number”); e.g. 0 / 0; NaN can also be thought of
as a missing value (more on that later).
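A few lines typed at the prompt illustrate the points above. This is only a quick sketch; the comments show what I would expect each expression to return.

```r
class(1)           ## "numeric"
class(1L)          ## "integer"
identical(1, 1L)   ## FALSE: same value, different storage type

1 / 0              ## Inf
1 / Inf            ## 0
0 / 0              ## NaN
is.nan(0 / 0)      ## TRUE
```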

5.5 Attributes
R objects can have attributes, which are like metadata for the object. These metadata can be very
useful in that they help to describe the object. For example, column names on a data frame help to
tell us what data are contained in each of the columns. Some examples of R object attributes are

• names, dimnames
• dimensions (e.g. matrices, arrays)
• class (e.g. integer, numeric)
• length
• other user-defined attributes/metadata

Attributes of an object (if any) can be accessed using the attributes() function. Not all R objects
contain attributes, in which case the attributes() function returns NULL.
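As a small sketch of how this looks in practice, the code below inspects a plain vector and a matrix, and then attaches a user-defined attribute with attr(). The attribute name "units" is made up for the example and has no special meaning to R.

```r
v <- 1:3
attributes(v)             ## NULL: a plain vector carries no attributes

m <- matrix(1:6, nrow = 2, ncol = 3)
attributes(m)             ## a list with one element, $dim, equal to 2 3

## attr() gets or sets a single attribute; "units" is a made-up name
attr(v, "units") <- "cm"
attributes(v)             ## now a list with one element, $units
```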

5.6 Creating Vectors


The c() function can be used to create vectors of objects by concatenating things together.

> x <- c(0.5, 0.6) ## numeric
> x <- c(TRUE, FALSE) ## logical
> x <- c(T, F) ## logical
> x <- c("a", "b", "c") ## character
> x <- 9:29 ## integer
> x <- c(1+0i, 2+4i) ## complex

Note that in the above example, T and F are short-hand ways to specify TRUE and FALSE. However,
in general one should try to use the explicit TRUE and FALSE values when indicating logical values.
The T and F values are primarily there for when you’re feeling lazy.
You can also use the vector() function to initialize vectors.

> x <- vector("numeric", length = 10)
> x
 [1] 0 0 0 0 0 0 0 0 0 0

5.7 Mixing Objects


There are occasions when different classes of R objects get mixed together. Sometimes this happens
by accident but it can also happen on purpose. So what happens with the following code?

> y <- c(1.7, "a") ## character
> y <- c(TRUE, 2) ## numeric
> y <- c("a", TRUE) ## character

In each case above, we are mixing objects of two different classes in a vector. But remember that
the only rule about vectors says this is not allowed. When different objects are mixed in a vector,
coercion occurs so that every element in the vector is of the same class.
In the example above, we see the effect of implicit coercion. What R tries to do is find a way to
represent all of the objects in the vector in a reasonable fashion. Sometimes this does exactly what
you want and…sometimes not. For example, combining a numeric object with a character object
will create a character vector, because numbers can usually be easily represented as strings.
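If you want to check what implicit coercion actually did, printing the result and calling class() on it is usually enough. The sketch below repeats the three cases above; the variable names y1, y2 and y3 are just for illustration.

```r
y1 <- c(1.7, "a")
y1            ## "1.7" "a"
class(y1)     ## "character"

y2 <- c(TRUE, 2)
y2            ## 1 2  (TRUE becomes 1)
class(y2)     ## "numeric"

y3 <- c("a", TRUE)
y3            ## "a" "TRUE"
class(y3)     ## "character"
```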

5.8 Explicit Coercion


Objects can be explicitly coerced from one class to another using the as.* functions, if available.

> x <- 0:6
> class(x)
[1] "integer"
> as.numeric(x)
[1] 0 1 2 3 4 5 6
> as.logical(x)
[1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE
> as.character(x)
[1] "0" "1" "2" "3" "4" "5" "6"

Sometimes, R can’t figure out how to coerce an object and this can result in NAs being produced.

> x <- c("a", "b", "c")
> as.numeric(x)
Warning: NAs introduced by coercion
[1] NA NA NA
> as.logical(x)
[1] NA NA NA
> as.complex(x)
Warning: NAs introduced by coercion
[1] NA NA NA

When nonsensical coercion takes place, you will usually get a warning from R.
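When only some elements fail to convert, it can be handy to find out which ones. The following is one possible approach, not the only one: compare the NAs in the result with the NAs in the input. The variable names num and failed are just for illustration.

```r
x <- c("1", "2.5", "abc", NA)
num <- suppressWarnings(as.numeric(x))   ## silence the coercion warning
num                                      ## 1.0 2.5 NA NA

## TRUE only where coercion created a new NA (the input itself was not NA)
failed <- is.na(num) & !is.na(x)
x[failed]                                ## "abc"
```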

5.9 Matrices
Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector
of length 2 (number of rows, number of columns).

> m <- matrix(nrow = 2, ncol = 3)
> m
     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA   NA   NA
> dim(m)
[1] 2 3
> attributes(m)
$dim
[1] 2 3

Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner
and running down the columns.

> m <- matrix(1:6, nrow = 2, ncol = 3)
> m
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

Matrices can also be created directly from vectors by adding a dimension attribute.

> m <- 1:10
> m
 [1]  1  2  3  4  5  6  7  8  9 10
> dim(m) <- c(2, 5)
> m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

Matrices can be created by column-binding or row-binding with the cbind() and rbind() functions.

> x <- 1:3
> y <- 10:12
> cbind(x, y)
     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12
> rbind(x, y)
  [,1] [,2] [,3]
x    1    2    3
y   10   11   12
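Because matrices are filled column-wise by default, it is worth knowing that matrix() also has a byrow argument for filling across the rows instead. The sketch below uses it and then removes the dimension attribute again, which shows that the underlying vector is still stored column by column.

```r
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
m[1, ]          ## 1 2 3: the first row now holds the first three values
m[, 1]          ## 1 4

## Removing the dimension attribute leaves the underlying vector,
## which is always stored column by column
dim(m) <- NULL
m               ## 1 4 2 5 3 6
```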

5.10 Lists
Lists are a special type of vector that can contain elements of different classes. Lists are a very
important data type in R and you should get to know them well. Lists, in combination with the
various “apply” functions discussed later, make for a powerful combination.
Lists can be explicitly created using the list() function, which takes an arbitrary number of
arguments.

> x <- list(1, "a", TRUE, 1 + 4i)
> x
[[1]]
[1] 1

[[2]]
[1] "a"

[[3]]
[1] TRUE

[[4]]
[1] 1+4i

We can also create an empty list of a prespecified length with the vector() function.

> x <- vector("list", length = 5)
> x
[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
NULL
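Here is a short sketch of working with the two lists above. It uses the double-bracket operator [[ to pull out individual elements, which is covered properly in the discussion of subsetting; the name res is just for illustration.

```r
x <- list(1, "a", TRUE, 1 + 4i)
sapply(x, class)   ## "numeric" "character" "logical" "complex"

x[[2]]             ## "a": double brackets extract the element itself
class(x[2])        ## "list": single brackets return a (shorter) list

## Fill a pre-allocated list one element at a time
res <- vector("list", length = 3)
for (i in seq_along(res)) {
    res[[i]] <- i^2
}
unlist(res)        ## 1 4 9
```

Pre-allocating with vector("list", n) and then filling the slots is a common pattern when the number of results is known in advance.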

5.11 Factors
Factors are used to represent categorical data and can be unordered or ordered. One can think of
a factor as an integer vector where each integer has a label. Factors are important in statistical
modeling and are treated specially by modelling functions like lm() and glm().
Using factors with labels is better than using integers because factors are self-describing. Having a
variable that has values “Male” and “Female” is better than a variable that has values 1 and 2.
Factor objects can be created with the factor() function.

> x <- factor(c("yes", "yes", "no", "yes", "no"))
> x
[1] yes yes no yes no
Levels: no yes
> table(x)
x
 no yes
  2   3
> ## See the underlying representation of factor
> unclass(x)
[1] 2 2 1 2 1
attr(,"levels")
[1] "no" "yes"

Often factors will be automatically created for you when you read a dataset in using a function like
read.table(). Those functions often default to creating factors when they encounter data that look
like characters or strings.
The order of the levels of a factor can be set using the levels argument to factor(). This can be
important in linear modelling because the first level is used as the baseline level.

> x <- factor(c("yes", "yes", "no", "yes", "no"))
> x ## Levels are put in alphabetical order
[1] yes yes no yes no
Levels: no yes
> x <- factor(c("yes", "yes", "no", "yes", "no"),
+ levels = c("yes", "no"))
> x
[1] yes yes no yes no
Levels: yes no
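You can confirm the effect of the levels argument by looking at the integer codes underneath the factor. The sketch below also shows an ordered factor, created with the ordered argument to factor(); the variable size and its values are made up for the example.

```r
x <- factor(c("yes", "yes", "no", "yes", "no"),
            levels = c("yes", "no"))
levels(x)        ## "yes" "no"
as.integer(x)    ## 1 1 2 1 2: "yes" is now level 1, the baseline

## An ordered factor also supports comparisons between its values
size <- factor(c("low", "high", "medium"),
               levels = c("low", "medium", "high"),
               ordered = TRUE)
size[1] < size[2]   ## TRUE: "low" comes before "high"
```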

5.12 Missing Values


Missing values are denoted by NA, or by NaN for undefined mathematical operations.

• is.na() is used to test objects if they are NA
• is.nan() is used to test for NaN
• NA values have a class also, so there are integer NA, character NA, etc.
• A NaN value is also NA but the converse is not true

> ## Create a vector with NAs in it
> x <- c(1, 2, NA, 10, 3)
> ## Return a logical vector indicating which elements are NA
> is.na(x)
[1] FALSE FALSE TRUE FALSE FALSE
> ## Return a logical vector indicating which elements are NaN
> is.nan(x)
[1] FALSE FALSE FALSE FALSE FALSE
> ## Now create a vector with both NA and NaN values
> x <- c(1, 2, NaN, NA, 4)
> is.na(x)
[1] FALSE FALSE TRUE TRUE FALSE
> is.nan(x)
[1] FALSE FALSE TRUE FALSE FALSE
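The point that NA values have a class can be checked directly, using the typed NA constants that R provides. The sketch below also shows how missing values propagate through a computation and how they are commonly removed; na.rm is an argument of mean() and many other summary functions.

```r
class(NA)              ## "logical"
class(NA_integer_)     ## "integer"
class(NA_character_)   ## "character"

x <- c(1, 2, NA, 10, 3)
mean(x)                ## NA: missing values propagate
mean(x, na.rm = TRUE)  ## 4
x[!is.na(x)]           ## 1 2 10 3
```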

5.13 Data Frames


Data frames are used to store tabular data in R. They are an important type of object in R and
are used in a variety of statistical modeling applications. Hadley Wickham’s package dplyr¹ has an
optimized set of functions designed to work efficiently with data frames.
Data frames are represented as a special type of list where every element of the list has to have the
same length. Each element of the list can be thought of as a column and the length of each element
of the list is the number of rows.
Unlike matrices, data frames can store different classes of objects in each column. Matrices must
have every element be the same class (e.g. all integers or all numeric).
In addition to column names, indicating the names of the variables or predictors, data frames have
a special attribute called row.names which indicates information about each row of the data frame.
Data frames are usually created by reading in a dataset using read.table() or read.csv().
However, data frames can also be created explicitly with the data.frame() function or they can be
coerced from other types of objects like lists.
Data frames can be converted to a matrix by calling data.matrix(). While it might seem that the
as.matrix() function should be used to coerce a data frame to a matrix, almost always, what you
want is the result of data.matrix().

> x <- data.frame(foo = 1:4, bar = c(T, T, F, F))
> x
  foo   bar
1   1  TRUE
2   2  TRUE
3   3 FALSE
4   4 FALSE
> nrow(x)
[1] 4
> ncol(x)
[1] 2
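To see why data.matrix() is usually the one you want, here is a sketch comparing the two conversions on a data frame that has a non-numeric column. The data frame df and its column names are made up for the example.

```r
df <- data.frame(id = 1:3, grp = factor(c("a", "b", "a")))

m1 <- as.matrix(df)
typeof(m1)       ## "character": everything was converted to strings

m2 <- data.matrix(df)
typeof(m2)       ## "integer"
m2[, "grp"]      ## 1 2 1: the factor is replaced by its integer codes
```

A character matrix is rarely useful for computation, whereas the numeric matrix from data.matrix() can be passed straight to numerical routines.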

¹https://github.com/hadley/dplyr

5.14 Names
R objects can have names, which is very useful for writing readable code and self-describing objects.
Here is an example of assigning names to an integer vector.

> x <- 1:3
> names(x)
NULL
> names(x) <- c("New York", "Seattle", "Los Angeles")
> x
   New York     Seattle Los Angeles
          1           2           3
> names(x)
[1] "New York" "Seattle" "Los Angeles"

Lists can also have names, which is often very useful.

> x <- list("Los Angeles" = 1, Boston = 2, London = 3)
> x
$`Los Angeles`
[1] 1

$Boston
[1] 2

$London
[1] 3
> names(x)
[1] "Los Angeles" "Boston" "London"

Matrices can have both column and row names.

> m <- matrix(1:4, nrow = 2, ncol = 2)
> dimnames(m) <- list(c("a", "b"), c("c", "d"))
> m
  c d
a 1 3
b 2 4

Column names and row names can be set separately using the colnames() and rownames()
functions.

> colnames(m) <- c("h", "f")
> rownames(m) <- c("x", "z")
> m
  h f
x 1 3
z 2 4

Note that for data frames, there is a separate function for setting the row names, the row.names()
function. Also, data frames do not have column names, they just have names (like lists). So to set
the column names of a data frame just use the names() function. Yes, I know it’s confusing. Here’s a
quick summary:

Object       Set column names   Set row names
data frame   names()            row.names()
matrix       colnames()         rownames()
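The summary can be tried out directly. In this sketch the data frame df, the matrix m, and all of the names assigned to them are made up for illustration.

```r
df <- data.frame(foo = 1:2, bar = c("a", "b"))
names(df) <- c("id", "label")            ## column names of a data frame
row.names(df) <- c("first", "second")    ## row names of a data frame

m <- matrix(1:4, nrow = 2, ncol = 2)
colnames(m) <- c("h", "f")               ## column names of a matrix
rownames(m) <- c("x", "z")               ## row names of a matrix

names(df)       ## "id" "label"
row.names(df)   ## "first" "second"
dimnames(m)     ## a list: row names first, then column names
```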

5.15 Summary
There are a variety of different built-in data types in R. In this chapter we have reviewed the following:

• atomic classes: numeric, logical, character, integer, complex
• vectors, lists
• factors
• missing values
• data frames and matrices

All R objects can have attributes that help to describe what is in the object. Perhaps the most useful
attribute is names, such as column and row names in a data frame, or simply names in a vector or
list. Attributes like dimensions are also important as they can modify the behavior of objects, like
turning a vector into a matrix.
6. Getting Data In and Out of R
6.1 Reading and Writing Data
Watch a video of this section¹
There are a few principal functions for reading data into R.

• read.table, read.csv, for reading tabular data
• readLines, for reading lines of a text file
• source, for reading in R code files (inverse of dump)
• dget, for reading in R code files (inverse of dput)
• load, for reading in saved workspaces
• unserialize, for reading single R objects in binary form

There are of course, many R packages that have been developed to read in all kinds of other datasets,
and you may need to resort to one of these packages if you are working in a specific area.
There are analogous functions for writing data to files

• write.table, for writing tabular data to text files (e.g. CSV) or connections
• writeLines, for writing character data line-by-line to a file or connection
• dump, for dumping a textual representation of multiple R objects
• dput, for outputting a textual representation of an R object
• save, for saving an arbitrary number of R objects in binary format (possibly compressed) to
a file.
• serialize, for converting an R object into a binary format for outputting to a connection (or
file).
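The reading and writing functions come in pairs, so a quick way to get a feel for them is a round trip through a file. The sketch below does this for dput()/dget() and for writeLines()/readLines(). It writes to temporary files created with tempfile(), and the object and variable names (y, new.y, f, g) are just for illustration.

```r
y <- data.frame(a = 1, b = "a")

## tempfile() gives a throwaway path, so nothing in your working
## directory is touched
f <- tempfile(fileext = ".R")
dput(y, file = f)      ## write a textual representation of y
new.y <- dget(f)       ## read it back in
all.equal(y, new.y)    ## TRUE

g <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), g)
readLines(g)           ## "first line" "second line"
```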

6.2 Reading Data Files with read.table()


The read.table() function is one of the most commonly used functions for reading data. The help
file for read.table() is worth reading in its entirety if only because the function gets used a lot
(run ?read.table in R). I know, I know, everyone always says to read the help file, but this one is
actually worth reading.
The read.table() function has a few important arguments:
¹https://youtu.be/Z_dc_FADyi4


• file, the name of a file, or a connection
• header, logical indicating if the file has a header line
• sep, a string indicating how the columns are separated
• colClasses, a character vector indicating the class of each column in the dataset
• nrows, the number of rows in the dataset. By default read.table() reads an entire file.
• comment.char, a character string indicating the comment character. This defaults to "#". If there
are no commented lines in your file, it’s worth setting this to be the empty string "".
• skip, the number of lines to skip from the beginning
• stringsAsFactors, should character variables be coded as factors? In versions of R before 4.0.0 this
defaults to TRUE (from R 4.0.0 onward the default is FALSE) because back in the old days, if you had
data that were stored as strings, it was because
those strings represented levels of a categorical variable. Now we have lots of data that is text
data and they don’t always represent categorical variables. So you may want to set this to
be FALSE in those cases. If you always want this to be FALSE, you can set a global option via
options(stringsAsFactors = FALSE). I’ve never seen so much heat generated on discussion
forums about an R function argument than the stringsAsFactors argument. Seriously.

For small to moderately sized datasets, you can usually call read.table without specifying any other
arguments

> data <- read.table("foo.txt")

In this case, R will automatically

• skip lines that begin with a #
• figure out how many rows there are (and how much memory needs to be allocated)
• figure out what type of variable is in each column of the table.

Telling R all these things directly makes R run faster and more efficiently. The read.csv() function
is identical to read.table except that some of the defaults are set differently (like the sep argument).
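As a sketch of what telling R these things directly looks like, the code below writes a tiny comma-separated file to a temporary location and reads it back with the main arguments spelled out. The file contents, column names and variable names are made up for the example.

```r
## Create a small example file in a temporary location
tmp <- tempfile(fileext = ".txt")
writeLines(c("id,name,score",
             "1,alice,3.5",
             "2,bob,4.0"), tmp)

dat <- read.table(tmp,
                  header = TRUE,        ## first line holds column names
                  sep = ",",            ## columns are comma-separated
                  colClasses = c("integer", "character", "numeric"),
                  nrows = 2,            ## number of data rows to read
                  comment.char = "")    ## no commented lines in this file
sapply(dat, class)   ## "integer" "character" "numeric"

## read.csv() already defaults to header = TRUE and sep = ","
dat2 <- read.csv(tmp)
names(dat2)          ## "id" "name" "score"
```

For a file this small the extra arguments make no practical difference; the benefit shows up with larger files.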

6.3 Reading in Larger Datasets with read.table


Watch a video of this section²

²https://youtu.be/BJYYIJO3UFI

With much larger datasets, there are a few things that you can do that will make your life easier and
will prevent R from choking.

• Read the help page for read.table, which contains many hints

• Make a rough calculation of the memory required to store your dataset (see the next section
for an example of how to do this). If the dataset is larger than the amount of RAM on your
computer, you can probably stop right here.
• Set comment.char = "" if there are no commented lines in your file.
• Use the colClasses argument. Specifying this option instead of using the default can make
read.table() run MUCH faster, often twice as fast. In order to use this option, you have to know
the class of each column in your data frame. If all of the columns are “numeric”, for example,
then you can just set colClasses = "numeric". A quick and dirty way to figure out the classes
of each column is the following:

> initial <- read.table("datatable.txt", nrows = 100)
> classes <- sapply(initial, class)
> tabAll <- read.table("datatable.txt", colClasses = classes)

• Set nrows. This doesn’t make R run faster but it helps with memory usage. A mild overestimate
is okay. You can use the Unix tool wc to calculate the number of lines in a file.
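
If the Unix tool wc is not at hand, a line count can also be obtained from within R. The sketch below is one possible way to combine the hints in this list; the file is generated on the spot, so its name and contents are assumptions made only for illustration. Note that readLines() pulls the whole file into memory as text, so for truly large files wc remains the better choice.

```r
## Hypothetical file: 1 header line plus 1,000 data rows
path <- tempfile(fileext = ".txt")
write.table(data.frame(x = rnorm(1000), y = runif(1000)),
            file = path, row.names = FALSE)

## Count lines without parsing the columns (similar in spirit to `wc -l`)
n_lines <- length(readLines(path))
n_rows <- n_lines - 1   # subtract the header line

## Learn the column classes from the first 100 rows, then read everything
initial <- read.table(path, header = TRUE, nrows = 100)
classes <- sapply(initial, class)
tabAll <- read.table(path, header = TRUE, colClasses = classes,
                     nrows = n_rows, comment.char = "")
dim(tabAll)
```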

In general, when using R with larger datasets, it’s also useful to know a few things about your
system.

• How much memory is available on your system?
• What other applications are in use? Can you close any of them?
• Are there other users logged into the same system?
• What operating system are you using? Some operating systems can limit the amount of memory
a single process can access.

6.4 Calculating Memory Requirements for R Objects


Because R stores all of its objects in physical memory, it is important to be cognizant of how much
memory is being used up by all of the data objects residing in your workspace. One situation where
it’s particularly important to understand memory requirements is when you are reading a new
dataset into R. Fortunately, it’s easy to make a back-of-the-envelope calculation of how much memory
will be required by a new dataset.
For example, suppose I have a data frame with 1,500,000 rows and 120 columns, all of which are
numeric data. Roughly, how much memory is required to store this data frame? Well, on most
modern computers double precision floating point numbers³ are stored using 64 bits of memory, or
8 bytes. Given that information, you can do the following calculation:

1,500,000 × 120 × 8 bytes/numeric = 1,440,000,000 bytes
                                  = 1,440,000,000 / 2^20 bytes/MB
                                  = 1,373.29 MB
                                  = 1.34 GB

³http://en.wikipedia.org/wiki/Double-precision_floating-point_format

So the dataset would require about 1.34 GB of RAM. Most computers these days have at least that
much RAM. However, you need to be aware of

• what other programs might be running on your computer, using up RAM
• what other R objects might already be taking up RAM in your workspace

Reading in a large dataset for which you do not have enough RAM is one easy way to freeze up your
computer (or at least your R session). This is an unpleasant experience that usually requires
you to kill the R process, in the best case scenario, or reboot your computer, in the worst case. So
make sure to do a rough calculation of memory requirements before reading in a large dataset.
You’ll thank me later.
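
The back-of-the-envelope calculation in this section can be reproduced directly in R. This is only a sketch of the arithmetic; the variable names are made up, and the result ignores the small per-object overhead that R adds.

```r
## Dimensions of the hypothetical data frame from the text
rows <- 1500000
cols <- 120
bytes_per_numeric <- 8          # one double-precision number

total_bytes <- rows * cols * bytes_per_numeric
total_mb <- total_bytes / 2^20  # bytes -> MB
total_gb <- total_mb / 2^10     # MB -> GB

round(total_mb, 2)
round(total_gb, 2)

## For an object that already exists, object.size() reports its actual footprint
x <- numeric(1000)
print(object.size(x), units = "Kb")
```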
7. Using the readr Package
The readr package was recently developed by Hadley Wickham to deal with reading in large flat
files quickly. The package provides replacements for functions like read.table() and read.csv().
The analogous functions in readr are read_table() and read_csv(). These functions are often much
faster than their base R analogues and provide a few other nice features such as progress meters.
For the most part, you can use read_table() and read_csv() pretty much anywhere you might
use read.table() and read.csv(). In addition, if there are non-fatal problems that occur while
reading in the data, you will get a warning and the returned data frame will have some information
about which rows/observations triggered the warning. This can be very helpful for “debugging”
problems with your data before you get neck deep in data analysis.
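
A minimal side-by-side sketch of the two readers follows. It assumes the readr package may or may not be installed, so the readr part is guarded; the file and its columns (id, label) are invented for the example.

```r
## Hypothetical example file
path <- tempfile(fileext = ".csv")
writeLines(c("id,label", "1,a", "2,b", "3,c"), path)

## Base R reader
base_df <- read.csv(path, stringsAsFactors = FALSE)
str(base_df)

## readr reader, run only if the package is available
if (requireNamespace("readr", quietly = TRUE)) {
  ## col_types = "ic" is the compact specification: integer, then character
  readr_df <- readr::read_csv(path, col_types = "ic")
  print(readr_df)
  ## problems() lists the rows that triggered parsing warnings, if any
  print(readr::problems(readr_df))
}
```

Passing col_types plays the same role as colClasses in read.table(): the reader does not have to guess the column types.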

8. Using Textual and Binary Formats for Storing Data
Watch a video of this chapter¹
There are a variety of ways that data can be stored, including structured text files like CSV or tab-
delimited, or more complex binary formats. However, there is an intermediate format that is textual,
but not as simple as something like CSV. The format is native to R and is somewhat readable because
of its textual nature.
One can create a more descriptive representation of an R object by using the dput() or dump()
functions. The dump() and dput() functions are useful because the resulting textual format is
editable, and in the case of corruption, potentially recoverable. Unlike writing out a table or CSV file,
dump() and dput() preserve the metadata (sacrificing some readability), so that another user doesn’t
have to specify it all over again. For example, we can preserve the class of each column of a table or
the levels of a factor variable.
Textual formats can work much better with version control programs like Subversion or Git, which
can only track changes meaningfully in text files. In addition, textual formats can be longer-lived;
if there is corruption somewhere in the file, it can be easier to fix the problem because one can just
open the file in an editor and look at it (although this would probably only be done in a worst case
scenario!). Finally, textual formats adhere to the Unix philosophy², if that means anything to you.
There are a few downsides to using these intermediate textual formats. The format is not very space-
efficient, because all of the metadata is specified. Also, it is really only partially readable. In some
instances it might be preferable to have data stored in a CSV file and then have a separate code file
that specifies the metadata.

8.1 Using dput() and dump()


One way to pass data around is by deparsing the R object with dput() and reading it back in (parsing
it) using dget().

¹https://youtu.be/5mIPigbNDfk
²http://www.catb.org/esr/writings/taoup/

> ## Create a data frame
> y <- data.frame(a = 1, b = "a")
> ## Print 'dput' output to console
> dput(y)
structure(list(a = 1, b = structure(1L, .Label = "a", class = "factor")), .Names = c("a",
"b"), row.names = c(NA, -1L), class = "data.frame")

Notice that the dput() output is in the form of R code and that it preserves metadata like the class
of the object, the row names, and the column names.
The output of dput() can also be saved directly to a file.

> ## Send 'dput' output to a file
> dput(y, file = "y.R")
> ## Read in 'dput' output from a file
> new.y <- dget("y.R")
> new.y
  a b
1 1 a
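
To see what "preserving metadata" means in practice, the sketch below compares a dput()/dget() round trip with a CSV round trip. The factor with unused levels is an assumption chosen to make the difference visible; it is not part of the example above.

```r
## A factor whose level set is larger than the values actually present
y <- data.frame(a = 1, b = factor("a", levels = c("a", "b", "c")))

## Round trip through dput()/dget(): the levels travel with the data
r_path <- tempfile(fileext = ".R")
dput(y, file = r_path)
new.y <- dget(r_path)
levels(new.y$b)

## Round trip through CSV: only the values are stored, so unused levels are lost
csv_path <- tempfile(fileext = ".csv")
write.csv(y, csv_path, row.names = FALSE)
from_csv <- read.csv(csv_path, stringsAsFactors = TRUE)
levels(from_csv$b)
```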

Multiple objects can be deparsed at once using the dump() function and read back in using source().

> x <- "foo"
> y <- data.frame(a = 1L, b = "a")

We can dump() R objects to a file by passing a character vector of their names.

> dump(c("x", "y"), file = "data.R")
> rm(x, y)

The inverse of dump() is source().

> source("data.R")
> str(y)
'data.frame': 1 obs. of 2 variables:
$ a: int 1
$ b: Factor w/ 1 level "a": 1
> x
[1] "foo"

8.2 Binary Formats


The complement to the textual format is the binary format, which is sometimes necessary to use
for efficiency purposes, or because there’s just no useful way to represent data in a textual manner.
Also, with numeric data, one can often lose precision when converting to and from a textual format,
so it’s better to stick with a binary format.
The key functions for converting R objects into a binary format are save(), save.image(), and
serialize(). Individual R objects can be saved to a file using the save() function.

> a <- data.frame(x = rnorm(100), y = runif(100))
> b <- c(3, 4.4, 1 / 3)
>
> ## Save 'a' and 'b' to a file
> save(a, b, file = "mydata.rda")
>
> ## Load 'a' and 'b' into your workspace
> load("mydata.rda")

If you have a lot of objects that you want to save to a file, you can save all objects in your workspace
using the save.image() function.

> ## Save everything to a file
> save.image(file = "mydata.RData")
>
> ## load all objects in this file
> load("mydata.RData")

Notice that I’ve used the .rda extension when using save() and the .RData extension when using
save.image(). This is just my personal preference; you can use whatever file extension you want.
The save() and save.image() functions do not care. However, .rda and .RData are fairly common
extensions and you may want to use them because they are recognized by other software.
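
The point about precision made earlier in this section can be checked with a short experiment. This is a sketch under the assumption that dput() is called with its default settings, which write numbers as decimal text; the temporary file names are arbitrary.

```r
b <- c(3, 4.4, 1 / 3)

## Binary round trip: load() restores the object into the given environment
## and returns the names of the objects it restored
rda <- tempfile(fileext = ".rda")
save(b, file = rda)
env <- new.env()
loaded <- load(rda, envir = env)
loaded
identical(env$b, b)       # the binary copy is bit-for-bit the same

## Textual round trip with default dput() settings
txt <- tempfile(fileext = ".R")
dput(b, file = txt)
b_txt <- dget(txt)
identical(b_txt, b)       # can be FALSE, since 1/3 was written as decimal text
all.equal(b_txt, b)       # equal up to a small numerical tolerance
```

Loading into a separate environment, as done here, also avoids silently overwriting objects of the same name in your workspace.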
The serialize() function is used to convert individual R objects into a binary format that can be
communicated across an arbitrary connection. This may get sent to a file, but it could get sent over
a network or other connection.
When you call serialize() on an R object, the output will be a raw vector coded in hexadecimal
format.

> x <- list(1, 2, 3)
> serialize(x, NULL)
[1] 58 0a 00 00 00 02 00 03 02 03 00 02 03 00 00 00 00 13 00 00 00 03 00
[24] 00 00 0e 00 00 00 01 3f f0 00 00 00 00 00 00 00 00 00 0e 00 00 00 01
[47] 40 00 00 00 00 00 00 00 00 00 00 0e 00 00 00 01 40 08 00 00 00 00 00
[70] 00

If you want, this can be sent to a file, but in that case you are better off using something like save().
The benefit of the serialize() function is that it is the only way to perfectly represent an R object
in an exportable format, without losing precision or any metadata. If that is what you need, then
serialize() is the function for you.
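
The counterpart of serialize() is unserialize(). The following sketch shows a round trip in memory and through a binary file connection; the temporary file is an assumption used only for illustration.

```r
x <- list(1, 2, 3)

## In-memory round trip: serialize() with connection = NULL returns a raw vector
bytes <- serialize(x, NULL)
is.raw(bytes)
y <- unserialize(bytes)
identical(x, y)

## The same bytes can be written to and read from a binary connection
path <- tempfile()
con <- file(path, "wb")
serialize(x, con)
close(con)

con <- file(path, "rb")
z <- unserialize(con)
close(con)
identical(x, z)
```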
9. Interfaces to the Outside World
Watch a video of this chapter¹
Data are read in using connection interfaces. Connections can be made to files (most common) or to
other more exotic things.

• file, opens a connection to a file
• gzfile, opens a connection to a file compressed with gzip
• bzfile, opens a connection to a file compressed with bzip2
• url, opens a connection to a webpage

In general, connections are powerful tools that let you navigate files or other external objects.
Connections can be thought of as a translator that lets you talk to objects that are outside of R.
Those outside objects could be anything from a database to a simple text file or a web service API.
Connections allow R functions to talk to all these different external objects without you having to
write custom code for each object.

9.1 File Connections


Connections to text files can be created with the file() function.

> str(file)
function (description = "", open = "", blocking = TRUE, encoding = getOption("encoding"),
raw = FALSE)

The file() function has a number of arguments that are common to many other connection
functions so it’s worth going into a little detail here.

• description is the name of the file
• open is a code indicating what mode the file should be opened in

The open argument allows for the following options:

• “r” open file in read only mode
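
A short sketch of the description and open arguments in use is given below. The temporary files and their contents are made up for the example, and the "w" (write) mode used for the compressed file is one of the other standard values of open.

```r
## Create a small text file to read from (hypothetical contents)
path <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line", "third line"), path)

## Open the connection explicitly in read-only mode
con <- file(path, open = "r")
first_two <- readLines(con, n = 2)   # reads two lines and remembers the position
rest <- readLines(con)               # continues from where the last read stopped
close(con)

## gzfile() works the same way for gzip-compressed files
gz_path <- tempfile(fileext = ".gz")
gz <- gzfile(gz_path, open = "w")
writeLines(c("compressed", "text"), gz)
close(gz)

gz <- gzfile(gz_path, open = "r")
gz_lines <- readLines(gz)
close(gz)
gz_lines
```

Because an open connection keeps track of its position, a large file can be read in pieces rather than all at once.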


¹https://youtu.be/Pb01WoJRUtY

The Project Gutenberg eBook of Rifles and
Riflemen at the Battle of Kings Mountain
This ebook is for the use of anyone anywhere in the United States
and most other parts of the world at no cost and with almost no
restrictions whatsoever. You may copy it, give it away or re-use it
under the terms of the Project Gutenberg License included with this
ebook or online at www.gutenberg.org. If you are not located in the
United States, you will have to check the laws of the country where
you are located before using this eBook.

Title: Rifles and Riflemen at the Battle of Kings Mountain

Author: United States. National Park Service

Contributor: Alfred F. Hopkins
Carl Parcher Russell
Rogers W. Young

Release date: May 31, 2018 [eBook #57246]

Language: English

Credits: Produced by Stephen Hutcheson and the Online Distributed
Proofreading Team at http://www.pgdp.net

*** START OF THE PROJECT GUTENBERG EBOOK RIFLES AND
RIFLEMEN AT THE BATTLE OF KINGS MOUNTAIN ***
NATIONAL PARK SERVICE
POPULAR STUDY SERIES

History No. 12

Rifles and Riflemen
at the
Battle of Kings Mountain

UNITED STATES DEPARTMENT OF THE INTERIOR, J. A. KRUG,
Secretary

NATIONAL PARK SERVICE, NEWTON B. DRURY, Director

Reprinted 1947
CONTENTS
Kings Mountain, A Hunting Rifle Victory
The American Rifle at the Battle of Kings Mountain
Testing the Ferguson Rifle—Modern Marksman Attains High Precision
With Arm of 1776

For sale by the Superintendent of Documents, U. S. Government
Printing Office, Washington 25, D. C.—Price 15 cents

Maj. Patrick Ferguson, British commander at the Battle of Kings Mountain,
and inventor of the breechloading rifle bearing his name; from a marble
bust.

Kings Mountain
[1]
A Hunting Rifle Victory

By Roger W. Young, Historian
Branch of History

Kings Mountain, the fierce attack of American frontiersmen on
October 7, 1780, against Cornwallis’ scouting force under Ferguson,
was an unexpected onslaught carried out in the foothills of South
Carolina. This sudden uprising of the stalwart Alleghany
mountaineers, for the protection of their homes and people from the
threat of Tory invasion under British leadership, was relatively isolated
in conception and execution from the main course of the
Revolutionary War in the South.

Clearly uncontemplated in the grand British design to subjugate the
South in a final effort to end the Revolution, this accidental encounter
in the Southern Piedmont delayed incidentally, but did not alter
materially, the movement of Britain’s Southern Campaign. Kings
Mountain is notable chiefly perhaps as supplying the first definite
forewarning of the impending British military disasters of 1781. It was
decisive to the extent that it contributed the earliest distinct element
of defeat to the final major British campaign of the Revolution.

The extraordinary action occurred during one of the bleakest periods
of the Revolution. A major change in British military strategy had
again shifted the scene of action to the South in 1778. Faced by a
discouraging campaign in the North and assuming that the reputed
Loyalist sympathies of the South would be more conducive to a
victory there, the British war ministry had dictated the immediate
subjugation of the South. With the conquered Southern provinces as
a base of operations, the war office planned to crush
Washington’s armies in the North and East between offensives
from North and South, and thus bring the defeat of the more
stubborn Revolutionary Northern colonies.

Unimpeded by effective resistance, this Southern Campaign swept
unchecked through Georgia and part of South Carolina during 1778-79.
The surrender of Gen. Benjamin Lincoln’s American army at
Charleston, in May 1780, greatly strengthened the British hold on
South Carolina. Encouraged by the British successes, the Royalist and
Tory elements of the Georgia and South Carolina lowlands rose in
increasingly large numbers to the support of the Royal cause. Soon
most of South Carolina, except a few districts in the Piedmont, were
overrun by British and Royalist forces directed by Cornwallis, and he
was maturing plans for the invasion of North Carolina. His designs
were upset temporarily by the advance of a new American Army
under Gates. Meeting Cornwallis near Camden, August 16, 1780,
Gates suffered a disastrous defeat, again leaving South Carolina and
the route northward open to the British. By September, Cornwallis
again had undertaken the invasion of North Carolina, gaining a
foothold at Charlotte, a center of Whig power, after a skirmish there
late that month.

The sole Southern region in the path of Cornwallis’ northward march
which had remained undisturbed by the course of the war lay in the
foothills and ranges of the Alleghanies stretching through
northwestern South Carolina, western North Carolina, and into the
present eastern Tennessee. Only here, among the frontier
settlements of the independent mountain yeoman, could the patriotic
Whigs find refuge, late in the summer of 1780, from their despised
enemies, the propertied Royalist and Tory forces aroused by
Cornwallis. Occupied with establishing a new frontier and protecting
their rude homes from the nearer threat of the border Indians, the
mountain men had been little concerned with the war on the
seaboard. The influx of partisan Whig forces seeking sanctuary first
brought the effects of war vividly before them. But from the free and
comparatively peaceful existence, the backwoodsmen were soon to
be aroused to the protection of their homes and possessions by a
threat of direct aggression.

Only a few of the original Ferguson rifles are extant. The one shown is
exhibited at Kings Mountain National Military Park, South Carolina. Here
we see the profile of the piece with an 18-inch ruler to indicate scale.

That threat came from Maj. Patrick Ferguson, of Cornwallis’
command, who, after Camden, had been ordered to operate in
the South Carolina Piedmont to suppress the Whig opposition
remaining there and to arouse the back country Tories, organizing
their strength in support of the British cause. Encountering little
organized Whig resistance, and having rapidly perfected the Tory
strength in the Piedmont, Ferguson in September 1780 undertook a
foray against Gilbert Town, a Whig outpost in North Carolina, near
the present town of Rutherfordton. Fearful of such an invasion, the
border leaders, Isaac Shelby, of Sullivan County, and John Sevier, of
Washington County, North Carolina (both now in Tennessee), had
hurried to the Watauga settlements and called for volunteers to
defeat Ferguson. They also forwarded urgent appeals for aid to
Wilkes, Surry, Burke, and Rutherford Counties in North Carolina, and
to Washington County in Virginia.

From Gilbert Town, early in September, Ferguson dispatched his
famed invidious threat over the mountains to the backwoodsmen,
warning them “that if they did not desist from their opposition to the
British arms and take protection under his standard, he would march
his army over the mountains, hang their leaders, and lay their
country waste with fire and sword.” Actually this was but an empty
gesture from Ferguson who was then preparing one final foray across
the border in South Carolina before making a junction with Cornwallis
at Charlotte. Yet, to the freedom-loving frontier leaders the threat
became a challenge which strengthened their determination to
destroy the invader. Thus spurred, they assembled quickly, each in
hunting garb, with knapsack, blanket, and long hunting rifle, most of
them mounted, but some afoot. They were united by a strong resolve
to destroy Ferguson and his Tory force, even though they had many a
brother, cousin, or even a father among the back country men in
his command. In fact, the partisan and internecine warfare,
which raged during the Revolution through the southern highlands
and along the Piedmont with members of the same family arrayed
against each other as Whig and Tory, reached a climax in the Kings
Mountain expedition and engagement.

Assembling near the present Elizabethton, Tenn., late in September,
the mountaineers circled southeastward into upper South Carolina, in
swift pursuit of Ferguson. Joining the forces of Shelby and Sevier
were the Virginians under Campbell, and as the expedition marched
southward it was augmented by the border fighters under McDowell
and Cleveland. Though characterized by daring impulse, the purpose
of this strategic frontier uprising had been conceived coolly by these
leaders, and its execution, in pursuit and assault, was to be brilliantly
carried out. At the Cowpens in upper South Carolina, the expedition
was joined October 6 by further volunteers under local Whig leaders,
including Chronicle, Williams, Lacey, and Hawthorne. Recruits brought
definite word of Ferguson’s whereabouts near Kings Mountain. And
there, in a final council of war, were selected 910 stalwart fighting
men, all mounted, who immediately moved through the night upon
the position of Ferguson’s Provincial Corps and Tory militia, now
encamped atop the Kings Mountain spur.

Despite the added discomfort to their already fatigued bodies and
mounts, the expedition pushed determinedly through the cold night
rain, and en route the leaders, now commanded by Campbell,
devised a final plan of attack. Having agreed to surround the spur
and gradually to close in upon its defenders from all sides, the Whig
attackers engaged the 1,104 British Provincials, Tories, and Loyalists
at about 3 o’clock on the afternoon of October 7, 1780. In the
sanguinary one-hour engagement that ensued along the heavily
wooded and rocky slopes, the backwoodsmen, veterans of countless
border clashes even if untrained in formal warfare, gained a complete
victory, killing or capturing the entire British force. The most
illustrious casualty was, of course, Maj. Patrick Ferguson, the
British commander.

The extraordinary action is memorable primarily as an example of the
personal valor and resourcefulness of the American frontier fighter,
particularly the Scotch-Irish, during the Revolution. It demonstrated
the proficiency with which he took advantage of natural cover and
capitalized upon the ineffectiveness of the British downhill angle of
fire in successfully assaulting Ferguson’s position. The resulting
casualties clearly exhibited the unerring accuracy of the long rifle
used in skilled hands, even when confronted with the menace of
Ferguson’s bayonet charges. The engagement also afforded one of
the most interesting demonstrations during the Revolution of the use
of the novel breechloading Ferguson rifle. The Kings Mountain
expedition and engagement illustrated the characteristic vigor of the
untrained American frontiersman in rising to the threat of border
invasion. It recorded his military effectiveness in overcoming such a
danger and his initiative in disbanding quietly upon its passing,
especially when guided by strategy and tactics momentarily devised
by partisan leaders of the caliber of Shelby, Sevier, Campbell,
Cleveland, and Lacey.

To the long standing local strife between Whig and Tory, the results
of Kings Mountain were direct and considerable. It was an
unexpected blow which completely unnerved and undermined the
Loyalist organization in the Carolinas, and placed the downtrodden
Whig cause of the Piedmont in the ascendancy. Kings Mountain was a
climax to the social, economic, and military clashes between
democratic Whig and propertied Tory elements. In a sense it
epitomized this bitter struggle and its abrupt ending on what then
was the southwestern frontier. Heartening to the long repressed
Whigs, the engagement placed them in the control of the Piedmont,
and encouraged them to renewed resistance.

The disintegration of Loyalist power in the Carolinas after Kings
Mountain temporarily proved a real obstacle to Cornwallis’
hitherto unchecked northward movement. The demoralization of
the Loyalist forces, which were the main reliance for local support in
the prosecution of his campaign, left Cornwallis precariously situated
in hostile North Carolina territory with a renewed Whig threat to the
rear in South Carolina. Momentarily discouraged, he halted his North
Carolina offensive and retired from his foothold at Charlotte to a
defensive position at Winnsboro, in upper South Carolina. Here he
remained inactive, with his campaign at a standstill, until the
approach of reinforcements at his rear, under Leslie, enabled him to
resume his invasion of North Carolina early in January 1781.

This time Cornwallis’ march was more cautious in its initial stages. For
the enforced delay of the major British advance occasioned by Kings
Mountain and lengthened by indecision, had enabled Greene, the
new American commander in the South, to reorganize his shattered
and dispirited army and launch a renewed and two-fold offensive
upon the main British movement. It was this offensive in 1781, which
first successfully struck the British at Cowpens, then rapidly withdrew
through the Piedmont, further dissipated Cornwallis’ energies at
Guilford Courthouse, and prepared the way for the American victory
at Yorktown.

By providing an unexpected American victory on the South Carolina
border, Kings Mountain prevented the immediate subjugation of the
Carolinas and temporarily deranged the British campaign to establish
a completely conquered southern base of operation. By producing a
feeling of patriotic success at the inception of the final major British
campaign, Kings Mountain contributed to the renewing of American
resistance which resulted in the British disasters of 1781.

The American Rifle
[2]
At the Battle of Kings Mountain

By C. P. Russell, Chief Naturalist
Branch of Natural History

Progress made on the new museum at Kings Mountain National
Military Park, South Carolina, is worthy of record, and the fact that
the Service possesses a Ferguson rifle to put into that museum
constitutes special note within the record. To the average park visitor
“Ferguson rifle” means little or nothing, but to the student of military
history mention of that British weapon kindles a flame of interest.
The story of how the Ferguson rifle was pitted against the Kentucky
rifle at Kings Mountain is significant in this day of rearmament.

Maj. Patrick Ferguson was born in 1744, the son of a Scottish jurist,
James Ferguson of Pitfour. At an early age he became an officer in
the Royal North British Dragoons, and by the time the American
colonists revolted against British rule he had distinguished himself in
service with the Scotch militia and as an expeditionist during the
Carib insurrection in the West Indies. In 1776 he demonstrated to
British Government officials a weapon of his own invention, “a rifle
gun on a new construction which astonished all beholders.”

BREECH MECHANISM OF THE FERGUSON RIFLE
Breech plug lowered by one turn of the trigger guard

The remarkable feature of the gun is its perpendicular breech plug
equipped with a screw device so as to make it possible to lower it by
a revolution of the trigger guard which serves as a handle. When the
breech plug is lowered, an opening is left in the top of the barrel at
the breech. A spherical bullet dropped into this opening with the
muzzle of the gun held downward rolls forward through the chamber
where it is stopped by the lands of the rifling. A charge of powder
then poured into the opening fills the chamber behind the bullet,
whereupon one revolution of the trigger guard closes the breech and
the weapon is ready for priming and firing. Major Ferguson
demonstrated that six aimed shots per minute could be fired with an
accuracy creditable to any rifle. Advancing riflemen could fire four
aimed shots per minute; reloading being possible while the
marksman was running. Another great advantage of the Ferguson
rifle was found in the fact that it could be loaded while the marksman
was reclining—something quite impossible with the American rifle. A
patent was granted for the Ferguson invention on December 2, 1776,
and the weapon became the first breechloader used by organized
troops of any country.

On September 11, 1777, Major Ferguson commanded the small
unit of picked riflemen of the British Army who covered the
advance of Knyphausen and his German mercenaries at Brandywine.
An American who knew nothing of breechloading rifles, but who was
possessed of the old dependable Kentucky rifle, put a bullet into
Ferguson’s right arm, shattering the elbow. The major’s arm was
useless thereafter and while he was recuperating Sir William Howe
jealously took advantage of his disability, disbanded Ferguson’s
riflemen, and put into storage the superior rifles which they had
carried. This did not terminate the service of Ferguson, nor did it
relegate his rifle to the discard. His command was restored, and he
again took the field with his handful of riflemen. At Stony Point, N. Y.,
and Little Egg Harbor, N. J., he came out on top in the fighting with
American privateers and the famous Pulaski Legion. Had Great Britain
manufactured more of the Ferguson rifles, perhaps he would have
gained further victories.

Sir Henry Clinton’s expedition of 1779 against Charleston, S. C., found
Ferguson and a comparatively few of his rifles active in the
depredations of several thousand Tories organized to terrorize the
rebellious colonists of the Carolinas. They invaded the interior and
operated on the very western border of the Carolinas. For 5 months
he held sway over the upcountry, enticing or intimidating the young
men of the region to enlist under the British flag. The local militia so
formed in the wild back country were drilled by him in the ways of
the British Army, and all other inhabitants, so far as possible, were
pledged to faithful Royal service. The patriots of the interior
settlements lay helpless. Any Carolinian found in arms against the
King might be—and many were—hanged for treason. Finally, a British
proclamation was issued requiring all inhabitants to take active part
on the royalist side, which but served to bring about a notable
uprising of the Whigs who, throughout the summer of 1780, engaged
in fierce guerilla warfare against the organized Tories.

German Jäger rifle, used in America during the Revolution, above; as
compared with the Kentucky rifle of the Revolutionary period, below.

Not only did the sparsely populated settlements on the
headwaters of the Catawba, Broad, and Pacolet Rivers
contribute to the force that opposed Ferguson, but the over-mountain
settlements on the Watauga and Holston likewise sent their
backwoodsmen, all of whom were well experienced in Indian warfare.
The routes followed by these parties on their way to the Kings
Mountain rendezvous cross the present Blue Ridge National Parkway
in a number of places.

The unmerciful treatment of Buford’s patriots at the hands of Tarleton
had engendered savage fury on the part of the Whigs which was as
bitterly reciprocated by the Tories. Utter refusal of quarter was usual
in many battles. In the Carolinas, hand-to-hand encounters were
common, and the contest became a war of ruthless extermination.
General Greene, writing of this condition, said: “The animosity
between the Whigs and Tories renders their situation truly
deplorable.... The Whigs seem determined to extirpate the Tories,
and the Tories the Whigs.... If a stop cannot be put to these
massacres, the country will be depopulated in a few months more, as
neither Whig nor Tory can live.”

In September 1780, while this spirit of hatred was at its height, the
regiments of backwoods patriots, who were to go down in history as
“Kings Mountain Men,” rendezvoused at South Mountain north of
Gilbert Town and determined to set upon Ferguson and his
command, then believed to be in Gilbert Town. The followers of the
Whig border leaders, Campbell, Shelby, Sevier, Cleveland, Lacey,
Williams, McDowell, Hambright, Hawthorne, Brandon, Chronicle, and
Hammond, descended upon Gilbert Town on October 4 only to find
that the Tories, apprised of the planned attack, had evacuated that
place; Ferguson was in full retreat in an attempt to evade an
engagement. His goal was Charlotte and the safety of the British
forces there stationed under Cornwallis. On October 6,
Ferguson was attracted from his line of march to the
commanding eminence, Kings Mountain, known at that time by the
famous name that we apply today. His 1,100 loyalists went into camp
on these heights, and Ferguson declared that “he was on Kings
Mountain, that he was King of that mountain, and God Almighty
could not drive him from it.” He took none of the ordinary military
precautions of forming breastworks, but merely placed his baggage
wagons along the northeastern part of the mountain to give some
slight appearance of protection in the neighborhood of his
headquarters.

The united backwoodsmen, led by Campbell, had pursued the fleeing
Tories from Gilbert Town. Spies sent forward obtained accurate
information on the numbers and intentions of the Tories. It became
evident to the Whig leaders that, if they were to overtake their quarry
before reinforcements sent by Cornwallis might join them, a more
speedy pursuit would be necessary. Accordingly, on the night of
October 5, the best men, horses, and equipment were selected for a
forced march. About 900 picked horsemen, all well armed with the
Kentucky rifle, traveled by way of Cowpens, S. C., marching
throughout the rainy night of October 6, crossed the swollen Broad
River at Cherokee Ford, and on the afternoon of October 7 came
upon the Loyalists on their supposed stronghold.

The story of the battle which ensued is one of the thrilling chapters in
our history. The Whigs surrounded the mountain and, in spite of a
few bayonet charges made by the Tories, pressed up the slopes and
poured into the Loyalist lines such deadly fire from the long rifles that
in less than an hour 225 had been killed, 163 wounded, and 716
made prisoners. Major Ferguson fell with eight bullets in his body.
The Whigs lost 28 killed and 62 wounded.

PERFORMANCE OF THE FERGUSON RIFLE
Six shots a minute
Efficient in any weather
Four shots a minute while advancing
THE FERGUSON RIFLE

Patrick Ferguson, the best shot in the British army, invented a
rifle in 1776 that loaded at the breech. It was the first
breechloader carried by the troops of any country.

The Provincial Regulars are believed to have used this
splendid weapon at Kings Mountain.

The rifle was ahead of its time and was discarded after his
death. It is now rare.

Probably no other battle in the Revolution was so picturesque
or so furiously fought as that at Kings Mountain. The very
mountain thundered. Not a regular soldier was in the American ranks.
Every man there was actuated by a spirit of democracy. They fought
under leaders of their own choosing for the right to live in a land
governed by men of their own choice.

With the death of Ferguson, the rifles of his invention, with which
probably 150 of his men were armed, disappeared. Some were
broken in the fight and others were carried off by the victors. One
given by Ferguson to his companion, De Peyster, is today an heirloom
in the family of the latter’s descendants in New York City. It was
exhibited by the United States Government at the World’s Fair at
Chicago in 1893. A very few are to be found in museum collections in
this country and in England. The one possessed by the National Park
Service was obtained from a dealer in England through the vigilance
of members of the staff of the Colonial National Historical Park,
Virginia, and is now exhibited in the museum at Kings Mountain
National Military Park, South Carolina.

The Kings Mountain museum tells the story of the Revolutionary
backwoodsman and his place in the scheme of Americanism. Here
also is presented the story of the cultural, social, and economic
background of the Kings Mountain patriots, as well as the details of
the battle and its effect on the Revolution as a whole. Here lies the
rare opportunity to preserve for all time significant relics of Colonial
and Revolutionary days and at the same time interpret for a
multitude of visitors the basic elements in the story of the old frontier
—a story which affected most of the Nation during the century that
followed the Revolution.

Our interest here will turn to those intriguing reminders of how our
Colonial ancestors lived—their houses, their tools and implements,
their furniture, their books, and their guns. Because of the
significance of the American rifle in the battle of Kings Mountain, it
must be a feature of any Kings Mountain exhibit. In the Carolinas it
was as much a part of each patriot as was his good right arm.

Light in weight, graceful in line, economical in consumption of
powder and lead, fatally precise, and distinctly American, it was
for 100 years the great arbitrator that settled all differences
throughout the American wilderness. George Washington, while a
surveyor in the back country, as scout and diplomat on his march into
the Ohio country, and while with his Virginians on Braddock’s fatal
expedition, had formed the acquaintance of the hunters, Indian
fighters, and pioneers of the Alleghanies—riflemen all. These men
were drawn upon in 1775 to form the first units of the United States
Army, 10 companies of “expert riflemen.” The British, in an attempt to
compete with American accuracy of fire, cried for Jäger, German
huntsmen armed with rifles, and begged that they might be included
in the contingents of German troops.

From the numerous written comments on the American rifle and
riflemen made by British leaders, it would be possible to quote at
length regarding the effect of American rifle fire upon British morale
and casualty lists. We may call attention again to the statistics on the
Kings Mountain dead: British, 225; American, 28. Draper records that
20 dead Tories were found behind certain protruding rocks on the
crest of the hill, and that each victim was marked by a bullet hole in
his forehead. Col. George Hanger, British officer with Tarleton in
South Carolina, provides the following observation on the precision of
American rifle fire:

I never in my life saw better rifles (or men who shot better) than
those made in America; they are chiefly made in Lancaster, and
two or three neighboring towns in that vicinity, in Pennsylvania.
The barrels weigh about six pounds two or three ounces, and carry
a ball no larger than thirty-six to the pound; at least I never saw
one of the larger caliber, and I have seen many hundreds and
hundreds. I am not going to relate any thing respecting the
American war; but to mention one instance, as a proof of most
excellent skill of an American rifleman. If any man shew me an
instance of better shooting, I will stand corrected.

Colonel, now General Tarleton, and myself, were standing a few
yards out of a wood, observing the situation of a part of the enemy
which we intended to attack. There was a rivulet in the enemy’s
front, and a mill on it, to which we stood directly with our horses’
heads fronting, observing their motions. It was an absolute plain
field between us and the mill; not so much as a single bush
on it. Our orderly-bugle stood behind us, about 3 yards, but
with his horse’s side to our horses’ tails. A rifleman passed over the
mill-dam, evidently observing two officers, and laid himself down
on his belly; for, in such positions, they always lie, to take a good
shot at a long distance. He took a deliberate and cool shot at my
friend, at me, and the bugle-horn man. (I have passed several
times over this ground, and ever observed it with the greatest
attention; and I can positively assert that the distance he fired
from, at us, was full four hundred yards.)

Now, observe how well this fellow shot. It was in the month of
August, and not a breath of wind was stirring. Colonel Tarleton’s
horse and mine, I am certain, were not anything like two feet
apart; for we were in close consultation, how we should attack with
our troops, which laid 300 yards in the wood, and could not be
perceived by the enemy. A rifle-ball passed between him and me;
looking directly to the mill, I observed the flash of the powder. I
said to my friend, “I think we had better move, or we shall have
two or three of these gentlemen, shortly, amusing themselves at
our expence.” The words were hardly out of my mouth, when the
bugle horn man, behind us, and directly central, jumped off his
horse, and said, “Sir, my horse is shot.” The horse staggered, fell
down, and died. He was shot directly behind the foreleg, near to
the heart, at least where the great blood-vessels lie, which lead to
the heart. He took the saddle and bridle off, went into the woods,
and got another horse. We had a number of spare horses, led by
negro lads.

The rifle had been introduced into America about 1700 when there
was considerable immigration into Pennsylvania from Switzerland and
Austria, the only part of the world at that time where it was in use. It
was then short, heavy, clumsy, and little more accurate than the
musket. From this arm the American gunsmiths evolved the long,
slender, small-bore gun (about 36 balls to the pound) which by 1750
had reached the same state of development that characterized it at
the time of the Revolution. The German Jäger rifle brought to
America during the Revolution was by no means the equal of the
American piece. It was short-barreled and took a ball of 19 to the
pound. With its large ball and small powder charge its recoil was
heavy and its accurate range but little greater than that of the
smoothbore musket. It was the same gun that had been introduced
into America in 1700.

The standard military firearm of the Revolutionary period was
the flintlock musket weighing about 11 pounds. Its caliber was
11 gauge, that is, it would take a lead ball of 11 to the pound. At 100
yards a good marksman might make 40 percent of hits on a target
the size of a man standing. The musket ball, fitting loosely in the
barrel, could be loaded quickly. The fact that the military musket
always was equipped with a bayonet made it the dependable weapon
for all close fighting. As was so convincingly shown on the occasions
of the futile bayonet charges of Ferguson’s regulars on Kings
Mountain, however, the bayonet was not effective if enemy lines did
not stand to take the punishment of hand-to-hand fighting.
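The "balls to the pound" measure used above for the rifle (36), the Jäger piece (19), and the musket (11) can be turned into an approximate ball diameter with a little arithmetic. The following is only a minimal sketch, not a figure taken from the original text: it assumes a solid sphere of pure lead with a density of 11.34 grams per cubic centimeter, and the function name is a hypothetical one chosen for illustration.

```python
import math

GRAMS_PER_POUND = 453.59237      # avoirdupois pound
LEAD_DENSITY_G_PER_CM3 = 11.34   # assumption: pure lead at room temperature
CM_PER_INCH = 2.54


def ball_diameter_inches(balls_per_pound: float) -> float:
    """Diameter of a solid lead sphere weighing 1/balls_per_pound of a pound."""
    if balls_per_pound <= 0:
        raise ValueError("balls_per_pound must be positive")
    mass_g = GRAMS_PER_POUND / balls_per_pound
    volume_cm3 = mass_g / LEAD_DENSITY_G_PER_CM3
    # Sphere volume V = (pi / 6) * d**3, so d = (6V / pi) ** (1/3).
    diameter_cm = (6.0 * volume_cm3 / math.pi) ** (1.0 / 3.0)
    return diameter_cm / CM_PER_INCH


if __name__ == "__main__":
    # Balls-to-the-pound figures as quoted in the text.
    for label, n in [("American rifle", 36), ("Jäger rifle", 19), ("Military musket", 11)]:
        print(f"{label}: {n} to the pound, about {ball_diameter_inches(n):.2f} in.")
```

Under these assumptions the rifle ball comes out at roughly half an inch and the musket ball at roughly three-quarters of an inch. The result is the diameter of the ball itself; the bore of a loosely fitting musket would be somewhat larger.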

Each Whig on Kings Mountain had been told to act as his own
captain, to yield as he found it necessary, and to take every
advantage that was presented. In short, the patriots followed the
Indian mode of attack, using the splendid cover that the timber about
the mountain afforded, and selecting a definite human target for
every ball fired. Splendid leadership and command were exercised by
the Whig officers to make for concerted action every time a crisis
arose. This coordination, plus the Kentucky rifle and the “individual
power of woodcraft, marksmanship, and sportsmanship” of each
participant in the American forces, overcame all the military training
and discipline which had been injected into his Tory troops by
Ferguson.

Testing the Ferguson Rifle
[3]
Modern Marksman Attains High Precision With Arm of 1776

By Dr. Alfred F. Hopkins, formerly Field Curator, Museum Division, Washington.

History records that on June 1, 1776, at Woolwich, England, Maj.
Patrick Ferguson, of the British Army, demonstrated his newly devised
breechloading flintlock rifle to the astonishment of all beholders.
Quite recently at the Washington laboratory of the Museum Division
of the National Park Service beholders likewise were astonished at
the shooting qualities of the Ferguson gun.

While it is understood that tests of this historic arm have been made
in England within late years, it is believed that in this country the
sinister crack of a Ferguson had not been heard since 1780 at the
Battle of Kings Mountain, South Carolina.

Ferguson developed his rifle from two earlier types of breechloaders,
the Hardley and the Foster, upon which it was an actual
improvement, and his gun has the distinction of being the first
breechloading arm used by organized troops of any nation. The piece
is equipped with a breechplug which passes perpendicularly through
the breech of the barrel and this, having a quick-traveling screw
thread, is lowered or raised by a single revolution of the trigger guard
acting as a lever. When the breech plug is lowered, a circular opening
is left in the top of the barrel just large enough to take a spherical
bullet. In loading, the muzzle is held downward and the ball, fitting
snugly, is dropped into the opening and permitted to roll forward to
the front of the breech chamber where it is stopped by the lands of
the rifling. No wadding or patch is used. Powder and ball rolled to
form a cartridge would prove only a hindrance and disadvantage in
loading. A charge of powder is poured directly from a flask or
horn into the opening behind the bullet, filling the chamber.
One complete turn of the trigger guard causes the breech plug to
rise, closing the opening and ejecting the superfluous grains of
powder. When the flashpan is primed, the piece is ready for firing. In
the third illustration of this booklet the breech mechanism and
method of loading are shown. Major Ferguson is accredited with
loading and firing six shots in one minute.

No recent check is known to have been made upon the number of
Ferguson rifles now in existence. They undoubtedly do exist, but their
number is probably small. Apparently only some 200 were made
originally and their military use ended, owing to lack of foresight,
with the American Revolution. Six specimens were listed in 1928 as
being in collections in this country and in England, of which one,
probably two, were made by Newton, of Grantham, two by Egg, of
London, and one each by Turner and Wilson, of London. These six
guns varied somewhat in minor details.

A seventh specimen, the one now possessed by the Service at Kings
Mountain National Military Park, South Carolina, bears the name of F.
Innis, Edinburgh. It is in exceptionally fine condition, showing much
of the original metal finish, and is without replacements. The piece
measures 4 feet 4¾ inches over all and weighs 7½ pounds. The
barrel, slightly belled at the muzzle and not designed to carry a
bayonet, is 37 inches long, rifled with 8 grooves, and takes a ball of
.655 caliber. The full length combed walnut stock is checkered at the
grip and has three brass thimbles and an engraved butt plate. On the
lock plate forward of the hammer, within a scroll, is the name, F. INNIS,
and this, with the addition of EDINBURGH, together with the proof
mark and the view mark of the Gunmakers’ Company of London,
appears upon the barrel. The wooden ramrod is horn-tipped and at
the other end has a bullet worm enclosed within a screw cap. The
arm was intended for an officer.

The Centennial Monument at Kings Mountain, unveiled on the 100th
anniversary of the Battle, October 7, 1880.

The recent tests conducted indoors at the Ford’s Theater
Laboratory were made to determine the exact method of
loading the arm, about which there had been some question, and to
learn something of its shooting qualities. Loading was found to be
extremely easy, suggesting that with practice the record set by Major
Ferguson might be attained readily. The ball, weighing approximately
500 grains, was dropped, without patch or wad, into the breech
chamber. A charge of approximately 1½ drams of Dupont “Fg” black
powder was poured in behind it. Closure of the breech automatically
gauged the charge, superfluous grains being ejected. The same
powder, more finely ground, was used as priming. Several preliminary
shots indicated that the rifle had precision and accuracy. Then, at a
distance of 90 feet, three shots were fired in succession from a table
rest by an expert marksman. Number one came within a half-inch,
number two came within 4 inches, and number three came within
1¾ inches of a 1⅝-inch bull’s-eye.

View of the Kings Mountain region, taken from the eastern slope of the
battlefield ridge, looking northeastwardly toward Henry’s Knob.

Granite obelisk erected by the Federal Government at Kings Mountain in
1909 to commemorate the Battle.
U. S. GOVERNMENT PRINTING OFFICE: 1947

Footnotes
[1]
From The Regional Review, National Park Service, Region One,
Richmond, Va., Vol. III, No. 6, December 1939, pp. 25-29.

[2]
From Idem., vol. V, No. 1, July 1940, pp. 15-21.

[3]
Idem., vol. VI, Nos. 1 and 2.

National Park Service
Popular Study Series
No. 1.—Winter Encampments of the Revolution.
No. 2.—Weapons and Equipment of Early American Soldiers.
No. 3.—Wall Paper News of the Sixties.
No. 4.—Prehistoric Cultures in the Southeast.
No. 5.—Mountain Speech in the Great Smokies.
No. 6.—New Echota, Birthplace of the American Indian Press.
No. 7.—Hot Shot Furnaces.
No. 8.—Perry at Put in Bay: Echoes of the War of 1812.
No. 9.—Wharf Building of a Century and More Ago.
No. 10.—Gardens of the Colonists.
No. 11.—Robert E. Lee and Fort Pulaski.
No. 12.—Rifles and Riflemen at the Battle of Kings Mountain.
No. 13.—Rifle Making in the Great Smoky Mountains.
No. 14.—American Charcoal Making in the Era of the Cold Blast
Furnace.
