Machine Learning Refined
With its intuitive yet rigorous approach to machine learning, this text provides students
with the fundamental knowledge and practical tools needed to conduct research and
build data-driven products. The authors prioritize geometric intuition and algorithmic
thinking, and include detail on all the essential mathematical prerequisites, to offer a
fresh and accessible way to learn. Practical applications are emphasized, with examples
from disciplines including computer vision, natural language processing, economics,
neuroscience, recommender systems, physics, and biology. Over 300 color illustra-
tions are included and have been meticulously designed to enable an intuitive grasp
of technical concepts, and over 100 in-depth coding exercises (in Python) provide a
real understanding of crucial machine learning algorithms. A suite of online resources
including sample code, data sets, interactive lecture slides, and a solutions manual are
provided online, making this an ideal text both for graduate courses on machine learning
and for individual reference and self-study.
Jeremy Watt received his PhD in Electrical Engineering from Northwestern University,
and is now a machine learning consultant and educator. He teaches machine learning,
deep learning, mathematical optimization, and reinforcement learning at Northwestern
University.
Reza Borhani received his PhD in Electrical Engineering from Northwestern University,
and is now a machine learning consultant and educator. He teaches a variety of courses
in machine learning and deep learning at Northwestern University.
Aggelos K. Katsaggelos is a professor at Northwestern University, where he heads the Image and Video Processing Laboratory. He is a Fellow of IEEE,
SPIE, EURASIP, and OSA and the recipient of the IEEE Third Millennium Medal
(2000).
Machine Learning Refined
JEREMY WATT
Northwestern University, Illinois
REZA BORHANI
Northwestern University, Illinois
AGGELOS K. KATSAGGELOS
Northwestern University, Illinois
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
79 Anson Road, #06–04/06, Singapore 079906
www.cambridge.org
Information on this title:
www.cambridge.org/9781108480727
DOI: 10.1017/9781108690935
© Cambridge University Press 2020
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2020
Printed and bound in Great Britain by Clays Ltd, Elcograf S.p.A.
A catalogue record for this publication is available from the British Library.
ISBN 978-1-108-48072-7 Hardback
Additional resources for this publication at www.cambridge.org/watt2
Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
To our families:
Preface page xii
Acknowledgements xxii
1 Introduction to Machine Learning 1
1.1 Introduction 1
1.2 Distinguishing Cats from Dogs: a Machine Learning Approach 1
1.3 The Basic Taxonomy of Machine Learning Problems 6
1.4 Mathematical Optimization 16
1.5 Conclusion 18
Part I Mathematical Optimization 19
2 Zero-Order Optimization Techniques 21
2.1 Introduction 21
2.2 The Zero-Order Optimality Condition 23
2.3 Global Optimization Methods 24
2.4 Local Optimization Methods 27
2.5 Random Search 31
2.6 Coordinate Search and Descent 39
2.7 Conclusion 40
2.8 Exercises 42
3 First-Order Optimization Techniques 45
3.1 Introduction 45
3.2 The First-Order Optimality Condition 45
3.3 The Geometry of First-Order Taylor Series 52
3.4 Computing Gradients Efficiently 55
3.5 Gradient Descent 56
3.6 Two Natural Weaknesses of Gradient Descent 65
3.7 Conclusion 71
3.8 Exercises 71
4 Second-Order Optimization Techniques 75
4.1 The Second-Order Optimality Condition 75
11.4 Naive Cross-Validation 335
References 564
Index 569
Preface
For eons we humans have sought out rules or patterns that accurately describe
how important systems in the world around us work, whether these systems
and ultimately, control it. However, the process of finding the "right" rule that
seems to govern a given system has historically been no easy task. For most of
our history data (glimpses of a given system at work) has been an extremely
scarce commodity. Moreover, our ability to compute, to try out various rules
the range of phenomena scientific pioneers of the past could investigate and
inevitably forced them to use philosophical and/or visual approaches to rule-
finding. Today, however, we live in a world awash in data, and have colossal
great pioneers can tackle a much wider array of problems and take a much more
the topic of this textbook, is a term used to describe a broad (and growing)
In the past decade the user base of machine learning has grown dramatically.
matics departments the users of machine learning now include students and
of machine learning into its most fundamental components, and a curated re-
will most benefit this broadening audience of learners. It contains fresh and
Book Overview
The second edition of this text is a complete revision of our first endeavor, with
virtually every chapter of the original rewritten from the ground up and eight
new chapters of material added, doubling the size of the first edition. Topics from
All classification and Principal Component Analysis have been reworked and
polished. A swath of new topics have been added throughout the text, from
While heftier in size, the intent of our original attempt has remained un-
only the tuning of individual machine learning models (introduced in Part II)
in Chapters 3 and 4, respectively. More specifically this part of the text con-
vised and unsupervised learning in Chapter 10, where we introduce the motiva-
machine learning: fixed-shape kernels, neural networks, and trees, where we discuss
universal approximator.
To get the most out of this part of the book we strongly recommend that
Chapter 11 and the fundamental ideas therein are studied and understood before
of subjects that the readers will need to understand in order to make full use of
the text.
enhancements in various ways (producing e.g., the RMSProp and Adam first-order methods),
in addition to the derivative/gradient, higher-order derivatives, the Hessian matrix,
including vector/matrix arithmetic, the notions of spanning sets and orthogonality,
well as for more knowledgeable readers who yearn for a more intuitive and
serviceable treatment than what is currently available today. To make full use of
the text one needs only a basic understanding of vector algebra (mathematical
for navigating the text based on a variety of learning outcomes and university
topics – as described further under "Instructors: How to use this Book" below).
We believe that intuitive leaps precede intellectual ones, and to this end defer
fresh and consistent geometric perspective throughout the text. We believe that
ual concepts in the text, but also that it helps establish revealing connections
between ideas often regarded as fundamentally distinct (e.g., the logistic re-
gression and Support Vector Machine classifiers, kernels and fully connected
cises, allowing them to "get their hands dirty" and "learn by doing," practicing
the concepts introduced in the body of the text. While in principle any program-
ming language can be used to complete the text’s coding exercises, we highly
recommend using Python for its ease of use and large support community. We
also recommend using the open-source Python libraries NumPy, autograd, and
matplotlib, as well as the Jupyter notebook editor to make implementing and
testing code easier. A complete set of installation instructions, datasets, and other
supporting resources can be found at
https://github.com/jermwatt/machine_learning_refined
This site also contains instructions for installing Python as well as a number
of other free packages that students will find useful in completing the text's
exercises.
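As a quick check that this recommended stack is working, the short snippet below (an illustrative sketch, not one of the book's exercises, and assuming NumPy, autograd, and matplotlib are installed) differentiates a simple quadratic cost with autograd and plots it with matplotlib.

# Minimal check of the recommended stack (NumPy, autograd, matplotlib).
# This is an illustrative snippet, not an exercise from the text.
import autograd.numpy as np     # autograd's thin NumPy wrapper
from autograd import grad       # automatic differentiation of Python functions
import matplotlib.pyplot as plt

g = lambda w: w**2              # a simple quadratic cost
dg = grad(g)                    # its derivative, computed automatically

print(g(2.0), dg(2.0))          # prints 4.0 and 4.0 (the derivative 2w at w = 2)

w_vals = np.linspace(-3, 3, 200)
plt.plot(w_vals, g(w_vals))     # plot the cost over a small range of w values
plt.xlabel('w'); plt.ylabel('g(w)')
plt.show()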
This book has been used as a basis for a number of machine learning courses
optimization and deep learning for graduate students. With its treatment of
quarter-based programs and universities where a deep dive into the entirety
of the book is not feasible due to time constraints. Topics for such a course
on this text expands on the essentials course outlined above both in terms
Figure 0.2.
optimization techniques from Part I of the text (as well as Appendix A) in-
All students in general, and those taking an optimization for machine learning
in identifying the "right" nonlinearity via the processes of boosting and regular-
batch normalization, and forward/backward mode of automatic differentiation
– can also be covered. A recommended roadmap for such a course – including
required chapters, sections, and topics to cover – is shown in Figure 0.3.
able for students who have had prior exposure to fundamental machine learning
concepts, and can begin with a discussion of appropriate first order optimiza-
of machine learning may be needed using selected portions of Part II of the text.
backpropagation and forward/backward mode of automatic differentiation, as
well as special topics like batch normalization and early-stopping-based cross-
validation, can then be made using Chapters 11, 13, and Appendices A and B of
ing – like convolutional and recurrent networks – can be found by visiting the book's GitHub repository.
[Figure 0.1 roadmap table omitted: for each of Chapters 1–14 it lists the recommended sections along with key topics, including the machine learning taxonomy; global/local optimization and the curse of dimensionality; gradient descent; least squares linear regression; logistic regression, the cross entropy/softmax cost, and SVMs; one-versus-all and multi-class logistic regression; principal component analysis and K-means; feature engineering and feature selection; nonlinear regression and classification; universal approximation, cross-validation, regularization, ensembling, and bagging; kernel methods and the kernel trick; fully connected networks and backpropagation; and regression/classification trees.]
Figure 0.1 Recommended study roadmap for a course on the essentials of machine
learning, including requisite chapters (left column), sections (middle column), and
topics to cover. This roadmap is suited for courses
where machine learning is not the sole focus but a key component of some broader
course of study. Note that chapters are grouped together visually based on text layout
detailed under "Book Overview" in the Preface. See the section titled "Instructors: How
To Use This Book" in the Preface for further details.
[Figure 0.2 roadmap table omitted: it extends the essentials roadmap with additional sections and topics per chapter, including Newton's method, least absolute deviations, the Perceptron, recommender systems and matrix factorization, boosting, K-fold cross-validation, activation functions, batch normalization and early stopping, and gradient boosting and random forests.]
Figure 0.2 Recommended study roadmap for a full treatment of standard machine learning subjects.
This plan entails a more in-depth coverage of machine learning topics compared to the
essentials roadmap given in Figure 0.1, and is best suited for senior undergraduate/early
graduate-level courses. See
the section titled "Instructors: How To Use This Book" in the Preface for further details.
[Figure 0.3 roadmap table omitted: it lists recommended chapters, sections, and topics for a course on mathematical optimization, including global/local optimization and the curse of dimensionality, gradient descent, Newton's method, online learning, feature scaling, PCA-sphering, missing data imputation, regularization, boosting, batch normalization, momentum acceleration, normalized schemes such as Adam and RMSProp, fixed and Lipschitz steplength rules, backtracking line search, and the forward/backward mode of automatic differentiation (Appendices A and B).]
Figure 0.3 Recommended study roadmap for a course on mathematical optimization
for machine learning and deep learning, including chapters, sections, as well as topics
to cover. See the section titled "Instructors: How To Use This Book" in the Preface for
further details.
[Figure 0.4 roadmap table omitted: it lists recommended chapters, sections, and topics covering gradient descent; nonlinear regression, classification, and autoencoders; universal approximation, cross-validation, and regularization; fully connected networks, backpropagation, and activation functions; momentum acceleration; normalized schemes such as Adam and RMSProp; stochastic/mini-batch optimization; and the forward/backward mode of automatic differentiation.]
Figure 0.4 Recommended study roadmap for a course on
deep learning, including chapters, sections, as well as topics to cover. See the section
titled "Instructors: How To Use This Book" in the Preface for further details.
Acknowledgements
This text could not have been written in anything close to its current form
new ideas included in the second edition of this text that greatly improved it as
We are also very grateful for the many students over the years that provided
insightful feedback on the content of this text, with special thanks to Bowen
the work.
Finally, a big thanks to Mark McNess Rosengren and the entire Standing
Passengers crew for helping us stay caffeinated during the writing of this text.
1 Introduction to Machine Learning
1.1 Introduction
Machine learning is a unified algorithmic framework designed to identify com-
putational models that accurately describe empirical data and the phenomena
underlying it, with little or no human involvement. While still a young dis-
cipline with much more awaiting discovery than is currently known, today
analytics (leveraged for sales and economic forecasting), to just name a few.
1.2 Distinguishing Cats from Dogs: a Machine Learning Approach
In this section we discuss a toy machine learning problem: distinguishing pictures
of cats from those with dogs. This will allow us to informally describe the steps
involved in solving a typical machine learning problem.
Do you recall how you first learned about the difference between cats and
dogs, and how they are different animals? The answer is probably no, as most
humans learn to perform simple cognitive tasks like this very early on in the
course of their lives. One thing is certain, however: young children do not need
some kind of formal scientific training, or a zoological lecture on Felis catus and
Canis familiaris species, in order to be able to tell cats and dogs apart. Instead,
they learn by example. They are naturally presented with many images of
what they are told by a supervisor (a parent, a caregiver, etc.) are either cats
or dogs, until they fully grasp the two concepts. How do we know when a
child can successfully distinguish between cats and dogs? Intuitively, when
they encounter new (images of) cats and dogs, and can correctly identify each
new example or, in other words, when they can generalize what they have learned.
Like human beings, computers can be taught how to perform this sort of task
distinguish between different types or classes of things (here cats and dogs) is
the difference between these two types of animals by learning from a batch of
examples, typically referred to as a training set of data. Figure 1.1 shows such a
training set consisting of a few images of different cats and dogs. Intuitively, the
larger and more diverse the training set, the better a computer (or human) can
perform a learning task.
Figure 1.1 A training set consisting of six images of cats (highlighted in blue) and six
images of dogs (highlighted in red). This set is used to train a machine learning model
that can distinguish between future images of cats and dogs. The images in this figure were taken from [1].
2. Feature design. Think for a moment about how we (humans) tell the difference
between images containing cats from those containing dogs. We use color, size,
the shape of the ears or nose, and/or some combination of these features in order
to distinguish between the two. In other words, we do not just look at an image
as simply a collection of many small square pixels. We pick out grosser details,
or features, from images like these in order to identify what it is that we are
looking at. This is true for computers as well. In order to successfully train a
computer to perform this task (and any machine learning task more generally)
we need to provide it with properly designed features or, ideally, have it find or
Designing quality features is typically not a trivial task as it can be very ap-
plication dependent. For instance, a feature like color would be less helpful in
discriminating between cats and dogs (since many cats and dogs share similar
hair colors) than it would be in telling grizzly bears and polar bears apart! More-
over, extracting the features from a training dataset can also be challenging. For
example, if some of our training images were blurry or taken from a perspective
where we could not see the animal properly, the features we designed might not be accurately extracted.
However, for the sake of simplicity with our toy problem here, suppose we
can easily extract the following two features from each image in the training set:
size of nose relative to the size of the head, ranging from small to large, and shape of ears, ranging from round to pointy.
Figure 1.2 Feature space representation of the training set shown in Figure 1.1 where
the horizontal and vertical axes represent the features nose size and ear shape,
respectively. The fact that the cats and dogs from our training set lie in distinct regions
Examining the training images shown in Figure 1.1, we can see that all cats
have small noses and pointy ears, while dogs generally have large noses and
round ears. Notice that with the current choice of features each image can now
be represented by just two numbers: a number expressing the relative nose size,
and another number capturing the pointiness or roundness of the ears. In other
words, we can represent each image as a point in a two-dimensional feature space
where the features nose size and ear shape are the horizontal and vertical axes,
respectively.
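To make this representation concrete, the sketch below encodes each training image as a pair of numbers; the specific values are entirely hypothetical stand-ins for the images of Figure 1.1, chosen only for illustration.

import numpy as np

# Hypothetical feature values for the training images (made up for illustration):
# each row is one image, written as (relative nose size, ear pointiness), with nose
# size scaled from 0 (small) to 1 (large) and ear pointiness from 0 (round) to 1 (pointy).
cats = np.array([[0.10, 0.90], [0.20, 0.85], [0.15, 0.95],
                 [0.25, 0.80], [0.10, 0.70], [0.30, 0.90]])
dogs = np.array([[0.70, 0.20], [0.80, 0.25], [0.65, 0.10],
                 [0.90, 0.30], [0.75, 0.15], [0.85, 0.05]])

# stack into a single training set with labels: +1 for cat, -1 for dog
X = np.vstack([cats, dogs])                     # shape (12, 2): one point per image
y = np.concatenate([np.ones(6), -np.ones(6)])   # the corresponding labels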
3. Model training. With our feature representation of the training data the
machine learning task of distinguishing between cats and dogs becomes a
simple geometric one: have the machine find a line or a curve that separates
the cats from the dogs in our carefully designed feature space. Supposing for
simplicity that we use a line, we must find the right values for its two parameters
– a slope and vertical intercept – that define the line’s orientation in the feature
space. The tuning of such a set of parameters to a training set is referred to as the
training of a model.
Figure 1.3 shows a trained linear model (in black) which divides the feature
space into cat and dog regions. This linear model provides a simple compu-
tational rule for distinguishing between cats and dogs: when the feature rep-
resentation of a future image lies above the line (in the blue region) it will be
considered a cat by the machine, and likewise any representation that falls below the line (in the red region) will be considered a dog.
Figure 1.3 A trained linear model (shown in black) provides a computational rule for
distinguishing between cats and dogs. Any new image received in the future will be
classified as a cat if its feature representation lies above this line (in the blue region), and
a dog if the feature representation lies below this line (in the red region).
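A minimal sketch of this kind of decision rule is given below. The slope and intercept are picked by hand purely for illustration (the figures do not specify numeric values), and the feature encoding follows the hypothetical one introduced earlier, in which cats sit above the line.

# A hand-picked line in the (nose size, ear pointiness) feature plane; the slope and
# intercept are illustrative stand-ins, not values learned from the training set.
slope, intercept = 1.0, 0.0

def classify(nose_size, ear_pointiness):
    """Return 'cat' if the feature point lies above the line, 'dog' if below."""
    line_height = slope * nose_size + intercept
    return 'cat' if ear_pointiness > line_height else 'dog'

print(classify(0.20, 0.90))   # small nose, pointy ears -> 'cat'
print(classify(0.80, 0.20))   # large nose, round ears  -> 'dog'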
Figure 1.4 A validation set of cat and dog images (also taken from [1]). Notice that the
images in this set are not highlighted in red or blue (as was the case with the training set
shown in Figure 1.1) indicating that the true identity of each image is not revealed to the
learner. Notice that one of the dogs, the Boston terrier in the bottom right corner, has
both a small nose and pointy ears. Because of our chosen feature representation the trained model will mistakenly classify this dog as a cat.
4. Model validation. To validate the efficacy of our trained learner we now show
the computer a batch of previously unseen images of cats and dogs, referred to
generally as a validation set of data, and see how well it can identify the animal
in each image. In Figure 1.4 we show a sample validation set for the problem at
hand, consisting of three new cat and dog images. To do this, we take each new
image, extract our designed features (i.e., nose size and ear shape), and simply
check which side of our line (or classifier) the feature representation falls on. In
this instance, as can be seen in Figure 1.5, all of the new cats and all but one dog
from the validation set have been identified correctly by our trained model.
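The validation step can be mimicked with the same kind of sketch. The feature values below are again hypothetical; the last point stands in for the Boston terrier, whose small nose and pointy ears cause the linear rule to mislabel it.

import numpy as np

# The same hand-picked rule as before: a point is called a cat if it lies above the line.
slope, intercept = 1.0, 0.0
is_cat = lambda x: x[1] > slope * x[0] + intercept   # x = (nose size, ear pointiness)

# Hypothetical validation points: two cats, one ordinary dog, and a Boston-terrier-like
# dog whose small nose and pointy ears mimic the cats in the training set.
validation = np.array([[0.15, 0.85],
                       [0.25, 0.90],
                       [0.85, 0.20],
                       [0.20, 0.80]])
truth = np.array([True, True, False, False])          # True means the image is a cat

predictions = np.array([is_cat(x) for x in validation])
print(predictions)                    # the terrier-like point is predicted as a cat
print(np.mean(predictions == truth))  # accuracy of 0.75 on this toy validation set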
The misidentification of the single dog (a Boston terrier) is largely the result
of our choice of features, which we designed based on the training set in Figure
1.1, and to some extent our decision to use a linear model (instead of a nonlinear
one). This dog has been misidentified simply because its features, a small nose
and pointy ears, match those of the cats from our training set. Therefore, while
it first appeared that a combination of nose size and ear shape could indeed
distinguish cats from dogs, we now see through validation that our training set
was perhaps too small and not diverse enough for this choice of features to be
fully effective. To improve our learner we can do several things. First, we
should collect more data, forming a larger and more diverse training set. Second,
we can consider designing/including more discriminating features (perhaps eye
color, tail shape, etc.) that further help distinguish cats from dogs using a linear
model. Finally, we can also try out (i.e., train and validate) an array of nonlinear
models with the hopes that a more complex rule might better distinguish be-
tween cats and dogs. Figure 1.6 compactly summarizes the four steps involved in solving our toy cat-versus-dog classification problem.
Figure 1.5 Identification of (the feature representation of) validation images using our
trained linear model. The Boston terrier (pointed to by an arrow) is misclassified as a cat
since it has pointy ears and a small nose, just like the cats in our training set.
Figure 1.6 The schematic pipeline of our toy cat-versus-dog classification problem. The
same general pipeline is used for essentially all machine learning problems.
Machine learning problems fall into two main categories called supervised and unsupervised learning, which
we discuss next.
1.3 The Basic Taxonomy of Machine Learning Problems
Supervised learning problems (like the one discussed in Section 1.2) refer to the
automatic learning of computational rules involving input/output relationships.
Applicable to a wide array of situations and data types, this
type of problem comes in two forms, called regression and classification, depend-
Regression
Suppose we wanted to predict the share price of a company that is about to
go public. To do this we first gather a training set of similar companies (in
the same domain) with known share prices. Next, we need to design feature(s)
that are thought to be relevant to the task at hand. The company’s revenue is one
such potential feature, as we can expect that the higher the revenue the more
expensive a share of stock should be. To connect the share price (output) to the
revenue (input) we can train a simple linear model or regression line using our
training data.
Figure 1.7 (top-left panel) A toy training dataset consisting of ten corporations’ share
price and revenue values. (top-right panel) A linear model is fit to the data. This trend
line models the overall trajectory of the points and can be used for prediction, as illustrated in the bottom panels where the share price of a new company is predicted from its revenue.
The top panels of Figure 1.7 show a toy dataset comprising share price versus
revenue information for ten companies, as well as a linear model fit to this data.
Once the model is trained, the share price of a new company can be predicted
based on its revenue, as depicted in the bottom panels of this figure. Finally,
comparing the predicted price to the actual price for a validation set of data
we can test the performance of our linear regression model and apply changes
as needed, for example, designing new features (e.g., total assets, total equity,
number of employees, years active, etc.) and/or trying more complex nonlinear
models.
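A minimal least-squares sketch of this training and prediction procedure is shown below. The revenue and share-price values are made up for illustration (the actual values behind Figure 1.7 are not given in the text), and NumPy's polyfit stands in for the fitting machinery developed later in the book.

import numpy as np

# Made-up training data standing in for the ten corporations of Figure 1.7:
# revenue (input) and share price (output) for each company.
revenue     = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
share_price = np.array([12., 16., 21., 24., 31., 33., 40., 44., 48., 55.])

# fit a line (slope and vertical intercept) to the data via least squares
slope, intercept = np.polyfit(revenue, share_price, deg=1)

# once trained, the model predicts the share price of a new company from its revenue
new_revenue = 6.5
print(slope * new_revenue + intercept)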
This sort of task, i.e., fitting a model to a set of training data so that predictions
can be made about new inputs, is referred to as regression. We begin with the
linear case, and move to nonlinear models starting in Chapter 10 and throughout
later chapters of the text.
Example 1.1 The rise of student loan debt in the United States
Figure 1.8 (data taken from [2]) shows the total student loan debt (that is money
borrowed by students to pay for college tuition, room and board, etc.) held
by citizens of the United States from 2006 to 2014, measured quarterly. Over
the eight-year period reflected in this plot the student debt has nearly tripled,
totaling over one trillion dollars by the end of 2014. The regression line (in
black) fits this dataset quite well and, with its sharp positive slope, emphasizes
the point that student debt is rising dangerously fast. Moreover, if this trend
continues, we can use the regression line to predict that total student debt will
surpass two trillion dollars by the year 2026 (we revisit this problem later in
Exercise 5.1).
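This extrapolation can be reproduced in a couple of lines. The slope and intercept below are rough stand-ins consistent with the description above (debt roughly tripling to just over one trillion dollars between 2006 and 2014), not the values actually fit to the data from [2].

# Rough stand-in trend line: debt grows from about 0.45 to about 1.15 trillion
# dollars between 2006 and 2014 (illustrative numbers, not the fitted model).
slope = (1.15 - 0.45) / (2014 - 2006)   # about 0.0875 trillion dollars per year
intercept = 0.45 - slope * 2006         # so that debt(2006) is about 0.45

debt = lambda year: slope * year + intercept
print(debt(2026))                       # about 2.2, i.e., past two trillion dollars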
Figure 1.8 Figure associated with Example 1.1, illustrating total student loan debt in the
United States measured quarterly from 2006 to 2014. The rapid increase rate of the debt,
measured by the slope of the trend line fit to the data, confirms that student debt is rising dangerously fast.