
PDF (Ebook) Data Science and Machine Learning: Mathematical and Statistical Methods (Chapman & Hall/Crc Machine Learning & Pattern Recognition) by Dirk P. Kroese, Zdravko Botev, Thomas Taimre, Radislav Vaisman ISBN 9781138492530, 1138492531 download

The document provides information about various ebooks related to data science, machine learning, and statistical methods, including titles by authors like Dirk P. Kroese and Simon Rogers. It includes links to download these ebooks in multiple formats and outlines the contents of a specific book titled 'Data Science and Machine Learning: Mathematical and Statistical Methods.' Additionally, it mentions the importance of copyright and provides details on obtaining permissions for reproducing content.


Data Science and Machine Learning
Mathematical and Statistical Methods
Chapman & Hall/CRC Machine Learning & Pattern
Recognition

Introduction to Machine Learning with Applications in Information Security
Mark Stamp

A First Course in Machine Learning
Simon Rogers, Mark Girolami

Statistical Reinforcement Learning: Modern Machine Learning Approaches
Masashi Sugiyama

Sparse Modeling: Theory, Algorithms, and Applications
Irina Rish, Genady Grabarnik

Computational Trust Models and Machine Learning
Xin Liu, Anwitaman Datta, Ee-Peng Lim

Regularization, Optimization, Kernels, and Support Vector Machines
Johan A.K. Suykens, Marco Signoretto, Andreas Argyriou

Machine Learning: An Algorithmic Perspective, Second Edition
Stephen Marsland

Bayesian Programming
Pierre Bessiere, Emmanuel Mazer, Juan Manuel Ahuactzin, Kamel Mekhnacha

Multilinear Subspace Learning: Dimensionality Reduction of Multidimensional Data
Haiping Lu, Konstantinos N. Plataniotis, Anastasios Venetsanopoulos

Data Science and Machine Learning: Mathematical and Statistical Methods
Dirk P. Kroese, Zdravko I. Botev, Thomas Taimre, Radislav Vaisman

For more information on this series please visit: https://www.crcpress.com/Chapman--HallCRC-Machine-Learning--Pattern-Recognition/book-series/erie
Data Science and Machine Learning
Mathematical and Statistical Methods

Dirk P. Kroese
Zdravko I. Botev
Thomas Taimre
Radislav Vaisman
Front cover image reproduced with permission from J. A. Kroese.

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2020 by Taylor & Francis Group, LLC


CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper

International Standard Book Number-13: 978-1-138-49253-0 (Hardback)

This book contains information obtained from authentic and highly regarded sources.
Reasonable efforts have been made to publish reliable data and information, but the author
and publisher cannot assume responsibility for the validity of all materials or the
consequences of their use. The authors and publishers have attempted to trace the copyright
holders of all material reproduced in this publication and apologize to copyright holders if
permission to publish in this form has not been obtained. If any copyright material has not
been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted,
reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other
means, now known or hereafter invented, including photocopying, microfilming, and
recording, or in any information storage or retrieval system, without written permission
from the publishers.

For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance
Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a
not-for-profit organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate system of
payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered
trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com

and the CRC Press Web site at
http://www.crcpress.com
To my wife and daughters: Lesley, Elise, and Jessica
— DPK

To Sarah, Sofia, and my parents


— ZIB

To my grandparents: Arno, Harry, Juta, and Maila


— TT

To Valerie
— RV
CONTENTS

Preface

Notation

1 Importing, Summarizing, and Visualizing Data


1.1 Introduction
1.2 Structuring Features According to Type
1.3 Summary Tables
1.4 Summary Statistics
1.5 Visualizing Data
1.5.1 Plotting Qualitative Variables
1.5.2 Plotting Quantitative Variables
1.5.3 Data Visualization in a Bivariate Setting
Exercises

2 Statistical Learning
2.1 Introduction
2.2 Supervised and Unsupervised Learning
2.3 Training and Test Loss
2.4 Tradeoffs in Statistical Learning
2.5 Estimating Risk
2.5.1 In-Sample Risk
2.5.2 Cross-Validation
2.6 Modeling Data
2.7 Multivariate Normal Models
2.8 Normal Linear Models
2.9 Bayesian Learning
Exercises

3 Monte Carlo Methods


3.1 Introduction
3.2 Monte Carlo Sampling
3.2.1 Generating Random Numbers
3.2.2 Simulating Random Variables
3.2.3 Simulating Random Vectors and Processes
3.2.4 Resampling
3.2.5 Markov Chain Monte Carlo
3.3 Monte Carlo Estimation
3.3.1 Crude Monte Carlo
3.3.2 Bootstrap Method
3.3.3 Variance Reduction
3.4 Monte Carlo for Optimization
3.4.1 Simulated Annealing
3.4.2 Cross-Entropy Method
3.4.3 Splitting for Optimization
3.4.4 Noisy Optimization
Exercises

4 Unsupervised Learning
4.1 Introduction
4.2 Risk and Loss in Unsupervised Learning
4.3 Expectation–Maximization (EM) Algorithm
4.4 Empirical Distribution and Density Estimation
4.5 Clustering via Mixture Models
4.5.1 Mixture Models
4.5.2 EM Algorithm for Mixture Models
4.6 Clustering via Vector Quantization
4.6.1 K-Means
4.6.2 Clustering via Continuous Multiextremal Optimization
4.7 Hierarchical Clustering
4.8 Principal Component Analysis (PCA)
4.8.1 Motivation: Principal Axes of an Ellipsoid
4.8.2 PCA and Singular Value Decomposition (SVD)
Exercises

5 Regression
5.1 Introduction
5.2 Linear Regression
5.3 Analysis via Linear Models
5.3.1 Parameter Estimation
5.3.2 Model Selection and Prediction
5.3.3 Cross-Validation and Predictive Residual Sum of Squares
5.3.4 In-Sample Risk and Akaike Information Criterion
5.3.5 Categorical Features
5.3.6 Nested Models
5.3.7 Coefficient of Determination
5.4 Inference for Normal Linear Models
5.4.1 Comparing Two Normal Linear Models
5.4.2 Confidence and Prediction Intervals
5.5 Nonlinear Regression Models
5.6 Linear Models in Python
5.6.1 Modeling
5.6.2 Analysis
5.6.3 Analysis of Variance (ANOVA)
5.6.4 Confidence and Prediction Intervals
5.6.5 Model Validation
5.6.6 Variable Selection
5.7 Generalized Linear Models
Exercises

6 Regularization and Kernel Methods


6.1 Introduction
6.2 Regularization
6.3 Reproducing Kernel Hilbert Spaces
6.4 Construction of Reproducing Kernels
6.4.1 Reproducing Kernels via Feature Mapping
6.4.2 Kernels from Characteristic Functions
6.4.3 Reproducing Kernels Using Orthonormal Features
6.4.4 Kernels from Kernels
6.5 Representer Theorem
6.6 Smoothing Cubic Splines
6.7 Gaussian Process Regression
6.8 Kernel PCA
Exercises

7 Classification
7.1 Introduction
7.2 Classification Metrics
7.3 Classification via Bayes’ Rule
7.4 Linear and Quadratic Discriminant Analysis
7.5 Logistic Regression and Softmax Classification
7.6 K-Nearest Neighbors Classification
7.7 Support Vector Machine
7.8 Classification with Scikit-Learn
Exercises

8 Decision Trees and Ensemble Methods


8.1 Introduction
8.2 Top-Down Construction of Decision Trees
8.2.1 Regional Prediction Functions
8.2.2 Splitting Rules
8.2.3 Termination Criterion
8.2.4 Basic Implementation
8.3 Additional Considerations
8.3.1 Binary Versus Non-Binary Trees
8.3.2 Data Preprocessing
8.3.3 Alternative Splitting Rules
8.3.4 Categorical Variables
8.3.5 Missing Values
8.4 Controlling the Tree Shape
8.4.1 Cost-Complexity Pruning
8.4.2 Advantages and Limitations of Decision Trees
8.5 Bootstrap Aggregation
8.6 Random Forests
8.7 Boosting
Exercises

9 Deep Learning
9.1 Introduction
9.2 Feed-Forward Neural Networks
9.3 Back-Propagation
9.4 Methods for Training
9.4.1 Steepest Descent
9.4.2 Levenberg–Marquardt Method
9.4.3 Limited-Memory BFGS Method
9.4.4 Adaptive Gradient Methods
9.5 Examples in Python
9.5.1 Simple Polynomial Regression
9.5.2 Image Classification
Exercises

A Linear Algebra and Functional Analysis


A.1 Vector Spaces, Bases, and Matrices
A.2 Inner Product
A.3 Complex Vectors and Matrices
A.4 Orthogonal Projections
A.5 Eigenvalues and Eigenvectors
A.5.1 Left- and Right-Eigenvectors
A.6 Matrix Decompositions
A.6.1 (P)LU Decomposition
A.6.2 Woodbury Identity
A.6.3 Cholesky Decomposition
A.6.4 QR Decomposition and the Gram–Schmidt Procedure
A.6.5 Singular Value Decomposition
A.6.6 Solving Structured Matrix Equations
A.7 Functional Analysis
A.8 Fourier Transforms
A.8.1 Discrete Fourier Transform
A.8.2 Fast Fourier Transform

B Multivariate Differentiation and Optimization


B.1 Multivariate Differentiation
B.1.1 Taylor Expansion
B.1.2 Chain Rule
B.2 Optimization Theory
B.2.1 Convexity and Optimization
B.2.2 Lagrangian Method
B.2.3 Duality
B.3 Numerical Root-Finding and Minimization
B.3.1 Newton-Like Methods
B.3.2 Quasi-Newton Methods
B.3.3 Normal Approximation Method
B.3.4 Nonlinear Least Squares
B.4 Constrained Minimization via Penalty Functions

C Probability and Statistics


C.1 Random Experiments and Probability Spaces
C.2 Random Variables and Probability Distributions
C.3 Expectation
C.4 Joint Distributions
C.5 Conditioning and Independence
C.5.1 Conditional Probability
C.5.2 Independence
C.5.3 Expectation and Covariance
C.5.4 Conditional Density and Conditional Expectation
C.6 Functions of Random Variables
C.7 Multivariate Normal Distribution
C.8 Convergence of Random Variables
C.9 Law of Large Numbers and Central Limit Theorem
C.10 Markov Chains
C.11 Statistics
C.12 Estimation
C.12.1 Method of Moments
C.12.2 Maximum Likelihood Method
C.13 Confidence Intervals
C.14 Hypothesis Testing

D Python Primer
D.1 Getting Started
D.2 Python Objects
D.3 Types and Operators
D.4 Functions and Methods
D.5 Modules
D.6 Flow Control
D.7 Iteration
D.8 Classes
D.9 Files
D.10 NumPy
D.10.1 Creating and Shaping Arrays
D.10.2 Slicing
D.10.3 Array Operations
D.10.4 Random Numbers
D.11 Matplotlib
D.11.1 Creating a Basic Plot
D.12 Pandas
D.12.1 Series and DataFrame
D.12.2 Manipulating Data Frames
D.12.3 Extracting Information
D.12.4 Plotting
D.13 Scikit-learn
D.13.1 Partitioning the Data
D.13.2 Standardization
D.13.3 Fitting and Prediction
D.13.4 Testing the Model
D.14 System Calls, URL Access, and Speed-Up

Bibliography

Index
PREFACE

In our present world of automation, cloud computing, algorithms, artificial
intelligence, and big data, few topics are as relevant as data science and
machine learning. Their recent popularity lies not only in their applicability
to real-life questions, but also in their natural blending of many different
disciplines, including mathematics, statistics, computer science,
engineering, science, and finance.
To someone starting to learn these topics, the multitude of computational
techniques and mathematical ideas may seem overwhelming. Some may be
satisfied with only learning how to use off-the-shelf recipes to apply to
practical situations. But what if the assumptions of the black-box recipe are
violated? Can we still trust the results? How should the algorithm be
adapted? To be able to truly understand data science and machine learning it
is important to appreciate the underlying mathematics and statistics, as well
as the resulting algorithms.
The purpose of this book is to provide an accessible, yet comprehensive,
account of data science and machine learning. It is intended for anyone
interested in gaining a better understanding of the mathematics and statistics
that underpin the rich variety of ideas and machine learning algorithms in
data science. Our viewpoint is that computer languages come and go, but
the underlying key ideas and algorithms will remain forever and will form
the basis for future developments.
Before we turn to a description of the topics in this book, we would like
to say a few words about its philosophy. This book resulted from various
courses in data science and machine learning at the Universities of
Queensland and New South Wales, Australia. When we taught these
courses, we noticed that students were eager to learn not only how to apply
algorithms but also to understand how these algorithms actually work.
However, many existing textbooks assumed either too much background
knowledge (e.g., measure theory and functional analysis) or too little
(everything is a black box), and the information overload from often
disjointed and contradictory internet sources made it more difficult for
students to gradually build up their knowledge and understanding. We
therefore wanted to write a book about data science and machine learning
that can be read as a linear story, with a substantial “backstory” in the
appendices. The main narrative starts very simply and builds up gradually to
quite an advanced level. The backstory contains all the necessary
background, as well as additional information, from linear algebra and
functional analysis (Appendix A), multivariate differentiation and
optimization (Appendix B), and probability and statistics (Appendix C).
Moreover, to make the abstract ideas come alive, we believe it is important
that the reader sees actual implementations of the algorithms, directly
translated from the theory. After some deliberation we have chosen Python
as our programming language. It is freely available and has been adopted as
the programming language of choice for many practitioners in data science
and machine learning. It has many useful packages for data manipulation
(often ported from R) and has been designed to be easy to program. A
gentle introduction to Python is given in Appendix D.
To keep the book manageable in size we had to be selective in our choice
of topics. Important ideas and connections between various concepts are
highlighted via keywords and page references (indicated by a ☞) in the
margin. Key definitions and theorems are highlighted in boxes. Whenever
feasible we provide proofs of theorems. Finally, we place great importance
on notation. It is often the case that once a consistent and concise system of
notation is in place, seemingly difficult ideas suddenly become obvious. We
use different fonts to distinguish between different types of objects. Vectors
are denoted by letters in boldface italics, x, X, and matrices by uppercase
letters in boldface roman font, A, K. We also distinguish between random
vectors and their values by using upper and lower case letters, e.g., X
(random vector) and x (its value or outcome). Sets are usually denoted by
calligraphic letters 𝒢, ℋ. The symbols for probability and expectation are
ℙ and 𝔼, respectively. Distributions are indicated by sans serif font, as in
Bin and Gamma; exceptions are the ubiquitous notations N and U for the
normal and uniform distributions. A summary of the most important
symbols and abbreviations is given on Pages xvii–xxi.
Data science provides the language and techniques necessary for
understanding and dealing with data. It involves the design, collection,
analysis, and interpretation of numerical data, with the aim of extracting
patterns and other useful information. Machine learning, which is closely
related to data science, deals with the design of algorithms and computer
resources to learn from data. The organization of the book follows roughly
the typical steps in a data science project: Gathering data to gain
information about a research question; cleaning, summarization, and
visualization of the data; modeling and analysis of the data; translating
decisions about the model into decisions and predictions about the research
question. As this is a mathematics and statistics oriented book, most
emphasis will be on modeling and analysis.
We start in Chapter 1 with the reading, structuring, summarization, and
visualization of data using the data manipulation package pandas in Python.
Although the material covered in this chapter requires no mathematical
knowledge, it forms an obvious starting point for data science: to better
understand the nature of the available data. In Chapter 2, we introduce the
main ingredients of statistical learning. We distinguish between supervised
and unsupervised learning techniques, and discuss how we can assess the
predictive performance of (un)supervised learning methods. An important
part of statistical learning is the modeling of data. We introduce various
useful models in data science including linear, multivariate Gaussian, and
Bayesian models. Many algorithms in machine learning and data science
make use of Monte Carlo techniques, which is the topic of Chapter 3.
Monte Carlo can be used for simulation, estimation, and optimization.
Chapter 4 is concerned with unsupervised learning, where we discuss
techniques such as density estimation, clustering, and principal component
analysis. We then turn our attention to supervised learning in Chapter 5, and
explain the ideas behind a broad class of regression models. Therein, we
also describe how Python’s statsmodels package can be used to define and
analyze linear models. Chapter 6 builds upon the previous regression
chapter by developing the powerful concepts of kernel methods and
regularization, which allow the fundamental ideas of Chapter 5 to be
expanded in an elegant way, using the theory of reproducing kernel Hilbert
spaces. In Chapter 7, we proceed with the classification task, which also
belongs to the supervised learning framework, and consider various
methods for classification, including Bayes classification, linear and
quadratic discriminant analysis, K-nearest neighbors, and support vector
machines. In Chapter 8 we consider versatile methods for regression and
classification that make use of tree structures. Finally, in Chapter 9, we
consider the workings of neural networks and deep learning, and show that
these learning algorithms have a simple mathematical interpretation. An
extensive range of exercises is provided at the end of each chapter.

Python code and data sets for each chapter can be downloaded from the
GitHub site: https://github.com/DSML-book

Acknowledgments
Some of the Python code for Chapters 1 and 5 was adapted from [73]. We
thank Benoit Liquet for making this available, and Lauren Jones for
translating the R code into Python.
We thank all who through their comments, feedback, and suggestions
have contributed to this book, including Qibin Duan, Luke Taylor, Rémi
Mouzayek, Harry Goodman, Bryce Stansfield, Ryan Tongs, Dillon Steyl,
Bill Rudd, Nan Ye, Christian Hirsch, Chris van der Heide, Sarat Moka,
Aapeli Vuorinen, Joshua Ross, Giang Nguyen, and the anonymous referees.
David Grubbs deserves a special accolade for his professionalism and
attention to detail in his role as Editor for this book.
The book was test-run during the 2019 Summer School of the Australian
Mathematical Sciences Institute. More than 80 bright upper-undergraduate
(Honours) students used the book for the course Mathematical Methods for
Machine Learning, taught by Zdravko Botev. We are grateful for the
valuable feedback that they provided.
Our special thanks go out to Robert Salomone, Liam Berry, Robin
Carrick, and Sam Daley, who commented in great detail on earlier versions
of the entire book and wrote and improved our Python code. Their
enthusiasm, perceptiveness, and kind assistance have been invaluable.
Of course, none of this work would have been possible without the loving
support, patience, and encouragement from our families, and we thank them
with all our hearts.
This book was financially supported by the Australian Research Council
Centre of Excellence for Mathematical & Statistical Frontiers, under grant
number CE140100049.

Dirk Kroese, Zdravko Botev,
Thomas Taimre, and Radislav Vaisman
Brisbane and Sydney
NOTATION

We could, of course, use any notation we want; do not laugh at
notations; invent them, they are powerful. In fact, mathematics
is, to a large extent, invention of better notations.
Richard P. Feynman

We have tried to use a notation system that is, in order of importance,
simple, descriptive, consistent, and compatible with historical choices.
Achieving all of these goals all of the time would be impossible, but we
hope that our notation helps to quickly recognize the type or “flavor” of
certain mathematical objects (vectors, matrices, random vectors, probability
measures, etc.) and clarify intricate ideas.
We make use of various typographical aids, and it will be beneficial for
the reader to be aware of some of these.

• Boldface font is used to indicate composite objects, such as column
vectors x = [x₁,…, xₙ]⊤ and matrices X = [xᵢⱼ]. Note also the difference
between the upright bold font for matrices and the slanted bold font for
vectors.

• Random variables are generally specified with upper case roman letters
X, Y, Z and their outcomes with lower case letters x, y, z. Random
vectors are thus denoted in upper case slanted bold font: X = [X₁,…, Xₙ]⊤.

• Sets of vectors are generally written in calligraphic font, such as 𝒳, but
the set of real numbers uses the common blackboard bold font ℝ.
Expectation and probability also use the latter font.

• Probability distributions use a sans serif font, such as Bin and Gamma.
Exceptions to this rule are the “standard” notations N and U for the
normal and uniform distributions.

• We often omit brackets when it is clear what the argument is of a
function or operator. For example, we prefer 𝔼X² to 𝔼[X²].

• We employ color to emphasize that certain words refer to a dataset,
function, or package in Python. All code is written in typewriter font.
To be compatible with past notation choices, we introduced a special
blue symbol X for the model (design) matrix of a linear model.

• Important notation such as T, g, g* is often defined in a mnemonic
way, such as T for “training”, g for “guess”, g* for the “star” (that is,
optimal) guess, and ℓ for “loss”.

• We will occasionally use a Bayesian notation convention in which the
same symbol is used to denote different (conditional) probability
densities. In particular, instead of writing f_X(x) and f_{X|Y}(x | y) for the
probability density function (pdf) of X and the conditional pdf of X
given Y, we simply write f(x) and f(x | y). This particular style of
notation can be of great descriptive value, despite its apparent
ambiguity.
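
For instance, assuming X and Y have a joint density, Bayes' rule in this convention could be written as follows. This is an illustrative formula only; in the discrete case the integral becomes a sum.

```latex
f(x \mid y) = \frac{f(y \mid x)\, f(x)}{f(y)},
\qquad
f(y) = \int f(y \mid x)\, f(x)\, \mathrm{d}x .
```

Here f(x), f(y), f(x | y), and f(y | x) are four different functions, distinguished only by their arguments. Written out in full, they would be f_X, f_Y, f_{X|Y}, and f_{Y|X}.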

General font/notation rules


x scalar
x vector
X random vector
X matrix
𝒳 set
x̂ estimate or approximation
x* optimal
x̄ average

Common mathematical symbols


∀ for all
∃ there exists
∝ is proportional to
⊥ is perpendicular to
~ is distributed as
$\overset{\text{iid}}{\sim}$, $\sim_{\text{iid}}$ are independent and identically distributed as
$\overset{\text{approx.}}{\sim}$ is approximately distributed as
∇f gradient of f
∇²f Hessian of f
f ∈ Cᵖ f has continuous derivatives of order p
≈ is approximately
≃ is asymptotically
≪ is much smaller than
⊕ direct sum
⊙ elementwise product
∩ intersection
∪ union
≔, =: is defined as
$\xrightarrow{\text{a.s.}}$ converges almost surely to
$\xrightarrow{d}$ converges in distribution to
$\xrightarrow{\mathbb{P}}$ converges in probability to
$\xrightarrow{L_p}$ converges in $L_p$-norm to
‖·‖ Euclidean norm
⌈x⌉ smallest integer larger than x
⌊x⌋ largest integer smaller than x
x⁺ max{x, 0}

Matrix/vector notation
A⊤, x⊤ transpose of matrix A or vector x
A⁻¹ inverse of matrix A
A⁺ pseudo-inverse of matrix A
A⁻⊤ inverse of matrix A⊤ or transpose of A⁻¹
A ≻ 0 matrix A is positive definite
A ⪰ 0 matrix A is positive semidefinite
dim(x) dimension of vector x
det(A) determinant of matrix A
|A| absolute value of the determinant of matrix A
tr(A) trace of matrix A

Reserved letters and words


ℂ set of complex numbers
d differential symbol
E expectation
e the number 2.71828 …
f probability density (discrete or continuous)
g prediction function
𝟙{A} or 𝟙_A indicator function of set A
i the square root of –1
ℓ risk: expected loss
Loss loss function
ln (natural) logarithm
ℕ set of natural numbers {0,1,…}
O big-O order symbol: f (x) = O (g (x)) if | f (x)| ⩽ αg(x) for
some constant α as x → a
o little-o order symbol: f (x) = o(g(x)) if f (x)/g(x) → 0 as x → a
ℙ probability measure
π the number 3.14159 …
ℝ set of real numbers (one-dimensional Euclidean space)
ℝⁿ n-dimensional Euclidean space
ℝ₊ positive real line: [0, ∞)
τ deterministic training set
T random training set
X model (design) matrix
ℤ set of integers {…, –1, 0, 1,…}

Probability distributions
Ber Bernoulli
Beta beta
Bin binomial
Exp exponential
Geom geometric
Gamma gamma
F Fisher–Snedecor F
N normal or Gaussian
Pareto Pareto
Poi Poisson
t Student’s t
U uniform

Abbreviations and acronyms


cdf cumulative distribution function
CMC crude Monte Carlo
CE cross-entropy
EM expectation–maximization
GP Gaussian process
KDE Kernel density estimate/estimator
KL Kullback–Leibler
KKT Karush–Kuhn–Tucker
iid independent and identically distributed
MAP maximum a posteriori
MCMC Markov chain Monte Carlo
MLE maximum likelihood estimator/estimate
OOB out-of-bag
PCA principal component analysis
pdf probability density function (discrete or continuous)
SVD singular value decomposition
CHAPTER 1

IMPORTING, SUMMARIZING, AND VISUALIZING DATA

This chapter describes where to find useful data sets, how to load
them into Python, and how to (re)structure the data. We also discuss
various ways in which the data can be summarized via tables and
figures. Which type of plots and numerical summaries are appropriate
depends on the type of the variable(s) in play. Readers unfamiliar with
Python are advised to read Appendix D first.

1.1 Introduction
Data comes in many shapes and forms, but can generally be thought of as
being the result of some random experiment — an experiment whose
outcome cannot be determined in advance, but whose workings are still
subject to analysis. Data from a random experiment are often stored in a
table or spreadsheet. A statistical convention is to denote variables — often
called features — as columns and the individual items (or units) as rows. It
is useful to think of three types of columns in such a spreadsheet:

1. The first column is usually an identifier or index column, where each
unit/row is given a unique name or ID.

2. Certain columns (features) can correspond to the design of the
experiment, specifying, for example, to which experimental group the
unit belongs. Often the entries in these columns are deterministic; that
is, they would stay the same if the experiment were to be repeated.

3. Other columns represent the observed measurements of the
experiment. Usually, these measurements exhibit variability; that is,
they would change if the experiment were to be repeated.
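
The three column types can be illustrated with a small pandas data frame. The sketch below uses made-up values for a hypothetical two-group experiment; the column names unit_id, group, and response are assumptions chosen for illustration and do not come from any data set discussed here.

```python
import pandas as pd

# Hypothetical data: six units from a made-up two-group experiment.
data = pd.DataFrame({
    # 1. Identifier column: a unique name for every unit (row).
    "unit_id": ["u1", "u2", "u3", "u4", "u5", "u6"],
    # 2. Design column: fixed by the experimenter, same on every repetition.
    "group": ["control", "control", "control",
              "treatment", "treatment", "treatment"],
    # 3. Measurement column: would vary if the experiment were repeated.
    "response": [4.1, 3.8, 4.4, 5.2, 5.9, 5.5],
}).set_index("unit_id")

# Store the design feature as a categorical (qualitative) variable.
data["group"] = data["group"].astype("category")

# Summarize the measurements per experimental group.
group_means = data.groupby("group", observed=True)["response"].mean()

print(data)
print(group_means)
```

Using the identifier as the index keeps it out of numerical summaries, and marking the design column as categorical makes its type explicit.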

There are many data sets available from the Internet and in software
packages. A well-known repository of data sets is the Machine Learning
Repository maintained by the University of California at Irvine (UCI),
found at https://archive.ics.uci.edu/.
These data sets are typically stored in a CSV (comma separated values)
format, which can be easily read into Python. For example, to access the
abalone data set from this website with Python, download the file to your
working directory, import the pandas package via
import pandas as pd

and read in the data as follows:


abalone = pd.read_csv('abalone.data', header = None)

It is important to add header = None, as this tells pandas that the first
line of the CSV file does not contain the names of the features; by default,
the first line is assumed to hold the feature names. The data set was
originally used to predict the age of abalone from physical measurements,
such as shell weight and diameter.
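
The effect of header = None can be checked without downloading anything. The sketch below reads the same two-record CSV text twice from an in-memory buffer; the records are made up in the style of a header-less file and are not taken from the abalone data.

```python
import io
import pandas as pd

# Two made-up records in the style of a CSV file without a header line.
raw = "M,0.455,0.365\nF,0.530,0.420\n"

# Default behaviour: the first record is consumed as the column names.
with_default = pd.read_csv(io.StringIO(raw))

# header=None: every record is kept as data; columns are numbered 0, 1, 2.
no_header = pd.read_csv(io.StringIO(raw), header=None)

print(with_default.shape)   # one data row remains
print(no_header.shape)      # both rows are kept
print(list(no_header.columns))
```

With the default setting, one observation is silently lost and its values become column labels, which is why the argument matters for files such as abalone.data.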
Another useful repository of over 1000 data sets from various packages
in the R programming language, collected by Vincent Arel-Bundock, can be
found at:

https://vincentarelbundock.github.io/Rdatasets/datasets.html.

For example, to read Fisher’s famous iris data set from R’s datasets
package into Python, type: