Introductory Statistics for Data Analysis

Warren J. Ewens • Katherine Brumberg

Warren J. Ewens
Department of Statistics and Data Science
University of Pennsylvania
Philadelphia, PA, USA

Katherine Brumberg
Department of Statistics and Data Science
University of Pennsylvania
Philadelphia, PA, USA

ISBN 978-3-031-28188-4 ISBN 978-3-031-28189-1 (eBook)


https://doi.org/10.1007/978-3-031-28189-1

Mathematics Subject Classification: 62-01

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

We take “data analysis” to be the analysis of data to obtain information of scientific
and social value. Much of the data currently considered derive from a sample,
and the randomness in the selection of that sample means that Statistics is a key
component in data analysis, since Statistics is the science of analyzing data derived
from some random process such as sampling.
Our aim in this book is to give a precise account of introductory Statistics
theory suitable for those wishing to analyze data from a variety of fields, including
medicine, biology, economics, social sciences, the physical sciences, and engineering.
However, the examples given in the book are often simple ones involving
flipping a coin, rolling a die, and so on. This is because we do not want the
complexities that arise in any given scientific field to obscure the basic principles
that we describe. We have emphasized concepts and the basics of the statistical
theory, first because they are central to any data analysis, and second because in our
teaching experience, this is what students find most difficult to understand.
This implies that we have occasionally been pedantic in presenting some
theoretical concepts. For example, we have been careful to distinguish between the
concepts of a mean and an average. The conflation of these two words, sometimes
in the same paragraph in published papers, has led in our experience to much
confusion for students. Similarly, we distinguish carefully between the concepts
of a random variable, of data, and of a parameter, using notation that helps in
making this distinction. On the other hand, we have not been pedantic in stating
the requirements, for example, needed for the Central Limit Theorem to hold.
We have followed a two-track approach in this book. A student not interested in
the computing aspects of the material can follow one track and ignore all references
to R. For a student interested in a computing approach to some parts of the material
discussed, an additional approach using R has been provided. The non-computing
part of the book is self-contained and can be read without any reference to R. All
examples and problems in this book contain small data sets so that they can be
analyzed with just a simple calculator.

We have often given detailed answers to the problems since this allows them to be
considered as instructive examples rather than as problems. We have also provided
flowcharts that help put the material discussed into perspective.
We are well aware of the practical aspects of data analysis, for example of
ensuring that the data analyzed form an unbiased representative sample of the
population of interest and that the assumptions made in the theory are justified,
and have referred to these and similar matters several times throughout the book.
However, our focus is on the basic theory, since in our experience this is sometimes
little understood, so that incorrect procedures and inappropriate assumptions are
sometimes used in data analysis.
Any errors or obscurities observed in this book will be reported at the webpage
https://kbrumberg.com/publication/textbook/ewens/. Possible errors can be reported
according to the instructions on the same webpage.

Philadelphia, PA, USA          Warren J. Ewens
January, 2023                  Katherine Brumberg
Contents

Part I Introduction

1 Statistics and Probability Theory
  1.1 What is Statistics?
  1.2 The Relation Between Probability Theory and Statistics
  1.3 Problems

Part II Probability Theory

2 Events
  2.1 What Are Events?
  2.2 Notation
  2.3 Derived Events: Complements, Unions and Intersections of Events
  2.4 Mutually Exclusive Events
  2.5 Problems
3 Probabilities of Events
  3.1 Probabilities of Derived Events
  3.2 Independence of Two Events
  3.3 Conditional Probabilities
  3.4 Conditional Probabilities and Mutually Exclusive Events
  3.5 Conditional Probabilities and Independence
  3.6 Problems
4 Probability: One Discrete Random Variable
  4.1 Random Variables
  4.2 Random Variables and Data
  4.3 The Probability Distribution of a Discrete Random Variable
  4.4 Parameters
  4.5 The Binomial Distribution
  4.6 The Hypergeometric Distribution
  4.7 The Mean of a Discrete Random Variable
  4.8 The Variance of a Discrete Random Variable
  4.9 Problems
5 Many Random Variables
  5.1 Introduction
  5.2 Notation
  5.3 Independently and Identically Distributed Random Variables
  5.4 The Mean and Variance of a Sum and of an Average
  5.5 The Mean and the Variance of a Difference
  5.6 The Proportion of Successes in n Binomial Trials
  5.7 Problems
6 Continuous Random Variables
  6.1 Definition
  6.2 The Mean and Variance of a Continuous Random Variable
  6.3 The Normal Distribution
  6.4 The Standardization Procedure
  6.5 Numbers that Are Seen Often in Statistics
  6.6 Using the Normal Distribution Chart in Reverse
  6.7 Sums, Averages and Differences of Independent Normal Random Variables
  6.8 The Central Limit Theorem
  6.9 Approximating Discrete Random Variable Probabilities Using the Normal Distribution
    6.9.1 The Binomial Case
    6.9.2 The Die Example
  6.10 A Window Into Statistics
  6.11 Problems

Part III Statistics

7 Introduction
8 Estimation of a Parameter
  8.1 Introduction
  8.2 Estimating the Binomial Parameter θ
    8.2.1 Properties of Estimates and Estimators
    8.2.2 The Precision of the Estimate of θ
  8.3 Estimating the Mean μ
    8.3.1 The Estimate of μ
    8.3.2 The Precision of the Estimate of μ
  8.4 Estimating the Difference Between Two Binomial Parameters θ1 − θ2
    8.4.1 The Estimate of θ1 − θ2
    8.4.2 The Precision of the Estimate of θ1 − θ2
  8.5 Estimating the Difference Between Two Means μ1 − μ2
    8.5.1 The Estimate of μ1 − μ2
    8.5.2 The Precision of the Estimate of μ1 − μ2
  8.6 Regression
  8.7 Problems
9 Testing Hypotheses About the Value of a Parameter
  9.1 Introduction to Hypothesis Testing
  9.2 Two Approaches to Hypothesis Testing
    9.2.1 Both Approaches, Step 1
    9.2.2 Both Approaches, Step 2
    9.2.3 Both Approaches, Step 3
    9.2.4 Steps 4 and 5
    9.2.5 Approach 1, Step 4, the Medicine Example
    9.2.6 Approach 1, Step 5, the Medicine Example
    9.2.7 Approach 1, Step 4, the Coin Example
    9.2.8 Approach 1, Step 5, the Coin Example
    9.2.9 Approach 2 to Hypothesis Testing
    9.2.10 Approach 2, Step 4, the Medicine and the Coin Examples
    9.2.11 Approach 2, Step 5, the Medicine Example
    9.2.12 Approach 2, Step 5, the Coin Example
  9.3 The Hypothesis Testing Procedure and the Concepts of Deduction and Induction
  9.4 Power
  9.5 Problems
10 Testing for the Equality of Two Binomial Parameters
  10.1 Two-by-Two Tables
  10.2 Simpson’s Paradox and Fisher’s Exact Test
  10.3 Notes on Two-by-Two Tables
  10.4 Two-Sided Two-by-Two Table Tests
  10.5 Problems
11 Chi-Square Tests (i): Tables Bigger Than Two-by-Two
  11.1 Large Contingency Tables
  11.2 Problems
12 Chi-Square Tests (ii): Testing for a Specified Probability Distribution
  12.1 Introduction
  12.2 Generalization
  12.3 A More Complicated Situation
  12.4 Problems
13 Tests on Means
  13.1 The One-Sample t Test
  13.2 The Two-Sample t Test
  13.3 The Paired Two-Sample t Test
  13.4 t Tests in Regression
  13.5 General Notes on t Statistics
  13.6 Exact Confidence Intervals
  13.7 Problems
14 Non-parametric Tests
  14.1 Introduction
  14.2 Non-parametric Alternative to the One-Sample t Test: The Wilcoxon Signed-Rank Test
  14.3 Non-parametric Alternative to the Two-Sample t Test: The Wilcoxon Rank-Sum Test
  14.4 Other Non-parametric Procedures
  14.5 Permutation Methods
    14.5.1 The Permutation Alternative to the Signed-Rank Test
    14.5.2 The Permutation Alternative to the Rank-Sum Test
  14.6 Problems

Useful Charts
Solutions to Problems
Index

Part I
Introduction
Chapter 1
Statistics and Probability Theory

1.1 What is Statistics?

The word “Statistics” means different things to different people. For a baseball fan,
it might relate to batting averages. For an actuary, it might relate to life tables. In
this book, we mean the scientific definition of “Statistics”, which is: Statistics is the
science of analyzing data in whose generation chance has played some part. This
sentence is the most important one in the entire book, and it permeates the entire
book. Statistics as we understand it via this definition has become a central area of
modern science and data analysis, as discussed below.
Why is Statistics now central to modern science and data analysis? This question
is best answered by considering the historical context. In the past, Mathematics
developed largely in association with areas of science in which chance mechanisms
were either non-existent or not important. Thus in the past a great deal of progress
was made in such areas as Physics, Engineering, Astronomy and Chemistry using
mathematical methods which did not allow any chance, or random, features in the
analysis. For example, no randomness is involved in Newton’s laws or in the theory
of relativity, both of which are entirely deterministic. It is true that quantum theory
is the prevailing paradigm in the physical sciences and that this theory intrinsically
involves randomness. However, that intrinsic level of randomness is not discussed
in this book.
Our focus is on more recently developed areas of science such as Medicine,
Biology and Psychology, in which there are various chance mechanisms at work,
and deterministic theory is no longer appropriate in these areas. In a medical clinical
trial of a proposed new medicine, the number of people cured by the new medicine
will depend on the choice of individuals in the trial: with a different group of
individuals, a different number of people cured will probably be seen. (Clinical
trials are discussed later in this book.) In areas such as Biology, there are many
random factors deriving from, for example, the random transmission of genes from
parent to offspring, implying that precise statements concerning the evolution of a
population cannot be made. Similar comments arise for many other areas of modern
science and indeed in all areas where inferences are to be drawn from data in whose
generation randomness played some part.
The data in “data analysis” are almost always a sample of some kind. A different
sample would almost certainly yield different data, so that the sampling process
introduces a second chance element over and above the inherent randomness in
areas of science described above. This means that in order to make progress in these
areas, one has to know how to analyze data in whose generation chance mechanisms
were at work.
This is where Statistics becomes relevant. The role played by Mathematics in
Physics, Engineering, Astronomy and Chemistry is played by Statistics in Medicine,
Economics, Biology and many other associated areas. Statistics is fundamental to
making progress in those areas. The following examples illustrate this.
Example 1.1.1 In a study to examine the effects of sunlight exposure on the growth
of a new type of grass, grass seeds were sown in 22 identical specifically designed
containers. Grass in 11 of these containers was exposed to full sunlight during the
growing period and grass in the remaining 11 containers was exposed to 50% shade
during the growing period. At the end of the growing period, the biomass in each
container was measured and the following data (in coded units) were obtained:

Full sun:  1903, 1935, 1910, 2096, 2008, 1961, 2060, 1644, 1612, 1811, 1714
50% Shade: 1759, 1718, 1820, 1933, 1990, 1920, 1796, 1696, 1578, 1682, 1526     (1.1)

There are clearly several chance mechanisms determining the data values that we
observed. A different experiment would almost certainly give different data. The
data do not immediately indicate an obvious difference between the two groups,
and in order to make our assessment about a possible difference, we will have to
use statistical methods, which allow for the randomness in the data. The statistical
analysis of data of this form is discussed in Sects. 8.5 and 13.2.
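
As a first, purely descriptive look at the data in (1.1), the following sketch computes the average biomass in each group and the difference between the two averages. It is an illustration only, written in Python rather than R. The variable and function names are hypothetical, and it does not carry out the statistical analysis of Sects. 8.5 and 13.2.

```python
# Biomass data (coded units) copied from (1.1).
full_sun = [1903, 1935, 1910, 2096, 2008, 1961, 2060, 1644, 1612, 1811, 1714]
half_shade = [1759, 1718, 1820, 1933, 1990, 1920, 1796, 1696, 1578, 1682, 1526]


def average(values):
    """Return the arithmetic average of a list of observed data values."""
    return sum(values) / len(values)


avg_sun = average(full_sun)
avg_shade = average(half_shade)

print(f"Average, full sun:  {avg_sun:.1f}")
print(f"Average, 50% shade: {avg_shade:.1f}")
print(f"Difference:         {avg_sun - avg_shade:.1f}")
```

The two averages differ, but the values within each group also vary considerably, so the averages alone cannot tell us whether the difference reflects anything more than chance.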
Example 1.1.2 The data from the 2020 clinical trial of the proposed Moderna
COVID vaccine, in which 30,420 volunteers were divided into two groups, 15,210
being given the proposed vaccine and 15,210 being given a harmless placebo, are
given below. The data are taken from L. R. Baden et al. Efficacy and Safety of the
mRNA-1273 SARS-CoV-2 Vaccine, New England Journal of Medicine 384:403-416,
February 2021.

                          Did not develop COVID   Did develop COVID   Total
Given proposed vaccine    15,199                  11                  15,210
Given placebo             15,025                  185                 15,210
Total                     30,224                  196                 30,420

The way in which data such as those in this table are analyzed statistically will be
described in Chap. 10. For now, we note that if this clinical trial had been carried out
on a different sample of 30,420 people, almost certainly different data would have
arisen. Again, Statistics provides a process for handling data where randomness
such as this arises.
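
To see what the data of Example 1.1.2 look like as proportions, the following Python sketch computes, for each group, the observed proportion of volunteers who developed COVID. The names used are hypothetical and the calculation is only a descriptive summary. It is not the statistical analysis described in Chap. 10.

```python
# Counts from the table in Example 1.1.2:
# group -> (did not develop COVID, did develop COVID).
trial = {
    "vaccine": (15199, 11),
    "placebo": (15025, 185),
}


def proportion_with_covid(no_covid, covid):
    """Observed proportion of a group that developed COVID."""
    return covid / (no_covid + covid)


for group, (no_covid, covid) in trial.items():
    p = proportion_with_covid(no_covid, covid)
    print(f"{group}: {covid} of {no_covid + covid} developed COVID ({p:.4%})")
```

A different sample of volunteers would give different counts and therefore different observed proportions, which is why a statistical analysis is needed before drawing any conclusion.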
These two examples are enough to make two important points. The first is that
because of the randomness inherent in the sampling process, no exact statements
such as those made, for example, in Physics are possible. We will have to make
statements indicating some level of uncertainty in our conclusions. It is not possible,
in analyzing data derived from a sampling process, to be 100% certain that our
conclusion is correct. This indicates a real limitation to what can be asserted in
modern science. More specific information about this lack of certainty is introduced
in Sect. 9.2.2 and then methods for handling this uncertainty are developed in later
sections.
The second point is that, because of the unpredictable random aspect in the
generation of the data arising in many areas of science, it is necessary to first
consider various aspects of probability theory in order to know what probability
calculations are needed for the statistical problem at hand. This book therefore
starts with an introduction to probability theory, with no immediate reference to
the associated statistical procedures. Before turning to the details of
probability theory, however, we first discuss the relation between probability theory and
Statistics.

1.2 The Relation Between Probability Theory and Statistics

We start with a simple example concerning the flipping of a coin. Suppose that we
have a coin that we suspect is biased towards heads. To check on this suspicion, we
flip the coin 2000 times and observe the number of heads that we get. Even if the
coin is fair, we would not expect, beforehand, to get exactly 1000 heads from the
2000 flips. This is because of the randomness inherent in the coin-flipping operation.
However, we would expect to see approximately 1000 heads. If once we flipped the
coin we got 1373 heads, we would obviously (and reasonably) claim that we have
very good evidence that the coin is biased towards heads. The reasoning that one
goes through in coming to this conclusion is probably something like this: “if the
coin is fair, it is extremely unlikely that we would get 1373 or more heads from 2000
flips. But since we did in fact get 1373 heads, we have strong evidence that the coin
is unfair.”
Conversely, if we got 1005 heads, we would not reasonably conclude that we
have good evidence that the coin is biased towards heads. The reason for coming to
this conclusion is that, because of the randomness involved in the flipping of a coin,
a fair coin can easily give 1005 or more heads from 2000 flips, so that observing
1005 heads gives no significant evidence that the coin is unfair.
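The two probabilities referred to in these examples can be computed directly. The following is a minimal sketch, assuming only that the flips are independent and the coin is fair, so that all 2^2000 head/tail sequences are equally likely. The function name upper_tail is an illustrative choice, not notation from this book.

```python
from math import comb


def upper_tail(n: int, k: int) -> float:
    """P(k or more heads in n independent flips of a fair coin)."""
    # Count the head/tail sequences that contain at least k heads ...
    favourable = sum(comb(n, j) for j in range(k, n + 1))
    # ... and divide by the 2**n equally likely sequences.
    return favourable / 2**n


p_1373 = upper_tail(2000, 1373)
p_1005 = upper_tail(2000, 1005)
print(f"P(1373 or more heads) = {p_1373:.3g}")
print(f"P(1005 or more heads) = {p_1005:.3g}")
```

The first probability should come out vanishingly small and the second not small at all, which is the contrast that the two examples rely on.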
These two examples are extreme cases, and in reality we often have to deal with
more gray-area situations. If we saw 1072 heads, intuition and common sense might
not help. What we have to do is to calculate the probability of getting 1072 or more
heads if the coin is fair. Probability theory calculations (which we will do later) show
that the probability of getting 1072 or more heads from 2000 flips of a fair coin is
very low (about 0.0006). This probability calculation is a deduction, or implication.
It is very unlikely that a fair coin would turn up heads 1072 times or more from
2000 flips. From this fact and the fact that we did see 1072 heads on the 2000 flips
of the coin, we make the statistical induction, or inference, that we can reasonably
conclude that we have significant evidence that the coin is biased.
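As a rough check on the figure quoted above, the sketch below computes the probability of 1072 or more heads in two ways: by an exact binomial sum, and by a normal approximation with no continuity correction. The choice of these two methods and the variable names are assumptions made for illustration. The text does not say here which method produced the value 0.0006.

```python
from math import comb, erfc, sqrt

n, k = 2000, 1072          # number of flips and observed heads, from the text
mean = n * 0.5             # expected number of heads for a fair coin
sd = sqrt(n * 0.5 * 0.5)   # standard deviation of the number of heads

# Exact: fraction of the 2**n equally likely sequences with k or more heads.
exact = sum(comb(n, j) for j in range(k, n + 1)) / 2**n

# Normal approximation to the same tail: P(Z >= z) = erfc(z / sqrt(2)) / 2.
z = (k - mean) / sd
approx = 0.5 * erfc(z / sqrt(2))

print(f"exact binomial tail:  {exact:.5f}")
print(f"normal approximation: {approx:.5f}  (z = {z:.2f})")
```

Both values should be of the same order as the "about 0.0006" quoted in the text. Small differences between them are to be expected, since the normal curve only approximates the binomial distribution.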
The logic is as follows. Either the coin is fair and something very unlikely has
happened (probability about 0.0006) or the coin is not fair. We prefer to believe the
second possibility. We do not like to entertain a hypothesis that does not reasonably
explain what we saw in practice. This argument follows the procedures of modern
science.
In coming to the opinion that the coin is unfair we could be incorrect: the coin
might have been fair and something very unlikely might have happened (1072
heads). We have to accept this possibility when using Statistics: we cannot be certain
that any conclusion, that is, any statistical induction or inference, that we reach is
correct. This problem is discussed in detail later in this book.
To summarize: probability theory makes deductions, or implications. Statistics
makes inductions, or inferences. Each induction, or inference, is always based on
both the data and the corresponding probability theory calculation relating to those data.
This induction might be incorrect because it is based on data in whose generation
randomness was involved.
In the coin example above, the statistical induction, or inference, that we made
(that we believe we have good evidence that the coin is unfair, given that there
were 1072 heads in the 2000 flips) was based entirely on the probability calculation
leading to the value 0.0006. In general, no statistical inference can be made
without first making the relevant probability theory calculation. This is one reason
why people often find Statistics difficult. In doing Statistics, we have to consider
aspects of probability theory, and unfortunately our intuition concerning probability
calculations is often incorrect.
Here is a more important example. Suppose that we are using some medicine
(the “current” medicine) to cure some illness. From experience we know that the
probability that this current medicine cures any given patient having this illness
is 0.8. A new medicine is proposed as being better than the current one. To
test whether this claim is justified, we plan to conduct a clinical trial in which the
new medicine will be given to 2000 people suffering from the disease in question.
If the new medicine is exactly as effective as the current one, we would, beforehand,
expect it to cure about 1600 of these people. Suppose that after the clinical trial is
conducted, the proposed new medicine cured 1643 people. Is this significantly more
than 1600? Calculations that we will do later show that the probability that we would
get 1643 or more people cured with the new medicine if it is exactly as effective as the
current medicine is about 0.009, or a bit less than 0.01. Thus if the new medicine did