Hands-On Machine Learning with Scikit-Learn and TensorFlow
Concepts, Tools, and Techniques to Build Intelligent Systems
Aurélien Géron
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Hands-On Machine Learning with
Scikit-Learn and TensorFlow, the cover image, and related trade dress are trademarks of O’Reilly Media,
Inc.
While the publisher and the author have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and the author disclaim all responsibility
for errors or omissions, including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this work is at your own
risk. If any code samples or other technology this work contains or describes is subject to open source
licenses or the intellectual property rights of others, it is your responsibility to ensure that your use
thereof complies with such licenses and/or rights.
978-1-491-96229-9
[LSI]
Table of Contents
Preface xiii
Check the Assumptions 40
Get the Data 40
Create the Workspace 40
Download the Data 43
Take a Quick Look at the Data Structure 45
Create a Test Set 49
Discover and Visualize the Data to Gain Insights 53
Visualizing Geographical Data 53
Looking for Correlations 55
Experimenting with Attribute Combinations 58
Prepare the Data for Machine Learning Algorithms 59
Data Cleaning 60
Handling Text and Categorical Attributes 62
Custom Transformers 64
Feature Scaling 65
Transformation Pipelines 66
Select and Train a Model 68
Training and Evaluating on the Training Set 68
Better Evaluation Using Cross-Validation 69
Fine-Tune Your Model 71
Grid Search 72
Randomized Search 74
Ensemble Methods 74
Analyze the Best Models and Their Errors 74
Evaluate Your System on the Test Set 75
Launch, Monitor, and Maintain Your System 76
Try It Out! 77
Exercises 77
3. Classification 79
MNIST 79
Training a Binary Classifier 82
Performance Measures 82
Measuring Accuracy Using Cross-Validation 83
Confusion Matrix 84
Precision and Recall 86
Precision/Recall Tradeoff 87
The ROC Curve 91
Multiclass Classification 93
Error Analysis 96
Multilabel Classification 100
Multioutput Classification 101
Exercises 102
6. Decision Trees 167
Training and Visualizing a Decision Tree 167
Making Predictions 169
Estimating Class Probabilities 171
The CART Training Algorithm 171
Computational Complexity 172
Gini Impurity or Entropy? 172
Regularization Hyperparameters 173
Regression 175
Instability 177
Exercises 178
Kernel PCA 218
Selecting a Kernel and Tuning Hyperparameters 219
LLE 221
Other Dimensionality Reduction Techniques 223
Exercises 224
Distributing a Deep RNN Across Multiple GPUs 397
Applying Dropout 399
The Difficulty of Training over Many Time Steps 400
LSTM Cell 401
Peephole Connections 403
GRU Cell 404
Natural Language Processing 405
Word Embeddings 405
An Encoder–Decoder Network for Machine Translation 407
Exercises 410
Exercises 469
Thank You! 470
D. Autodiff 507
Index 525
Preface
Or maybe your company has tons of data (user logs, financial data, production data,
machine sensor data, hotline stats, HR reports, etc.), and more than likely you could
unearth some hidden gems if you just knew where to look; for example:
• Segment customers and find the best marketing strategy for each group
• Recommend products for each client based on what similar clients bought
• Detect which transactions are likely to be fraudulent
• Predict next year’s revenue
• And more
Whatever the reason, you have decided to learn Machine Learning and implement it
in your projects. Great idea!
• Scikit-Learn is very easy to use, yet it implements many Machine Learning algorithms efficiently, so it makes for a great entry point to learn Machine Learning.
• TensorFlow is a more complex library for distributed numerical computation using data flow graphs. It makes it possible to train and run very large neural networks efficiently by distributing the computations across potentially thousands of multi-GPU servers. TensorFlow was created at Google and supports many of their large-scale Machine Learning applications. It was open-sourced in November 2015.
Prerequisites
This book assumes that you have some Python programming experience and that you
are familiar with Python’s main scientific libraries, in particular NumPy, Pandas, and
Matplotlib.
Also, if you care about what’s under the hood you should have a reasonable under‐
standing of college-level math as well (calculus, linear algebra, probabilities, and sta‐
tistics).
If you don’t know Python yet, http://learnpython.org/ is a great place to start. The official tutorial on python.org is also quite good.
If you have never used Jupyter, Chapter 2 will guide you through installation and the
basics: it is a great tool to have in your toolbox.
If you are not familiar with Python’s scientific libraries, the provided Jupyter note‐
books include a few tutorials. There is also a quick math tutorial for linear algebra.
Roadmap
This book is organized in two parts. Part I, The Fundamentals of Machine Learning,
covers the following topics:
• What is Machine Learning? What problems does it try to solve? What are the
main categories and fundamental concepts of Machine Learning systems?
• The main steps in a typical Machine Learning project.
• Learning by fitting a model to data.
• Optimizing a cost function.
• Handling, cleaning, and preparing data.
• Selecting and engineering features.
• Selecting a model and tuning hyperparameters using cross-validation.
• The main challenges of Machine Learning, in particular underfitting and overfit‐
ting (the bias/variance tradeoff).
• Reducing the dimensionality of the training data to fight the curse of dimension‐
ality.
• The most common learning algorithms: Linear and Polynomial Regression,
Logistic Regression, k-Nearest Neighbors, Support Vector Machines, Decision
Trees, Random Forests, and Ensemble methods.
Part II, Neural Networks and Deep Learning, covers the following topics:
The first part is based mostly on Scikit-Learn while the second part uses TensorFlow.
Don’t jump into deep waters too hastily: while Deep Learning is no
doubt one of the most exciting areas in Machine Learning, you
should master the fundamentals first. Moreover, most problems
can be solved quite well using simpler techniques such as Random
Forests and Ensemble methods (discussed in Part I). Deep Learn‐
ing is best suited for complex problems such as image recognition,
speech recognition, or natural language processing, provided you
have enough data, computing power, and patience.
Other Resources
Many resources are available to learn about Machine Learning. Andrew Ng’s ML
course on Coursera and Geoffrey Hinton’s course on neural networks and Deep
Learning are amazing, although they both require a significant time investment
(think months).
There are also many interesting websites about Machine Learning, including of
course Scikit-Learn’s exceptional User Guide. You may also enjoy Dataquest, which
provides very nice interactive tutorials, and ML blogs such as those listed on Quora.
Finally, the Deep Learning website has a good list of resources to learn more.
Of course there are also many other introductory books about Machine Learning, in
particular:
• Joel Grus, Data Science from Scratch (O’Reilly). This book presents the funda‐
mentals of Machine Learning, and implements some of the main algorithms in
pure Python (from scratch, as the name suggests).
• Stephen Marsland, Machine Learning: An Algorithmic Perspective (Chapman and
Hall). This book is a great introduction to Machine Learning, covering a wide
range of topics in depth, with code examples in Python (also from scratch, but
using NumPy).
• Sebastian Raschka, Python Machine Learning (Packt Publishing). Also a great
introduction to Machine Learning, this book leverages Python open source libra‐
ries (Pylearn 2 and Theano).
• Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, Learning from
Data (AMLBook). A rather theoretical approach to ML, this book provides deep
insights, in particular on the bias/variance tradeoff (see Chapter 4).
• Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd
Edition (Pearson). This is a great (and huge) book covering an incredible amount
of topics, including Machine Learning. It helps put ML into perspective.
O’Reilly Safari
Safari (formerly Safari Books Online) is a membership-based
training and reference platform for enterprise, government,
educators, and individuals.
Members have access to thousands of books, training videos, Learning Paths, interactive tutorials, and curated playlists from over 250 publishers, including O’Reilly Media, Harvard Business Review, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Adobe, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, and Course Technology, among others.
For more information, please visit http://oreilly.com/safari.
How to Contact Us
Please address comments and questions concerning this book to the publisher:
We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/hands-on-machine-learning-with-scikit-learn-and-tensorflow.
To comment or ask technical questions about this book, send email to bookquestions@oreilly.com.
For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
Acknowledgments
I would like to thank my Google colleagues, in particular the YouTube video classifi‐
cation team, for teaching me so much about Machine Learning. I could never have
started this project without them. Special thanks to my personal ML gurus: Clément
Courbet, Julien Dubois, Mathias Kende, Daniel Kitachewsky, James Pack, Alexander
Pak, Anosh Raj, Vitor Sessak, Wiktor Tomczak, Ingrid von Glehn, Rich Washington,
and everyone at YouTube Paris.
I am incredibly grateful to all the amazing people who took time out of their busy
lives to review my book in so much detail. Thanks to Pete Warden for answering all
my TensorFlow questions, reviewing Part II, providing many interesting insights, and
of course for being part of the core TensorFlow team. You should definitely check out
his blog! Many thanks to Lukas Biewald for his very thorough review of Part II: he left
no stone unturned, tested all the code (and caught a few errors), made many great
suggestions, and his enthusiasm was contagious. You should check out his blog and
his cool robots! Thanks to Justin Francis, who also reviewed Part II very thoroughly,
catching errors and providing great insights, in particular in Chapter 16. Check out
his posts on TensorFlow!
Huge thanks as well to David Andrzejewski, who reviewed Part I and provided
incredibly useful feedback, identifying unclear sections and suggesting how to
improve them. Check out his website! Thanks to Grégoire Mesnil, who reviewed
Part II and contributed very interesting practical advice on training neural networks.
Thanks as well to Eddy Hung, Salim Sémaoune, Karim Matrah, Ingrid von Glehn,
Iain Smears, and Vincent Guilbeau for reviewing Part I and making many useful sug‐
gestions. And I also wish to thank my father-in-law, Michel Tessier, former mathe‐
matics teacher and now a great translator of Anton Chekhov, for helping me iron out
some of the mathematics and notations in this book and reviewing the linear algebra
Jupyter notebook.
And of course, a gigantic “thank you” to my dear brother Sylvain, who reviewed every
single chapter, tested every line of code, provided feedback on virtually every section,
and encouraged me from the first line to the last. Love you, bro!
Many thanks as well to O’Reilly’s fantastic staff, in particular Nicole Tache, who gave
me insightful feedback, always cheerful, encouraging, and helpful. Thanks as well to
Marie Beaugureau, Ben Lorica, Mike Loukides, and Laurel Ruma for believing in this
project and helping me define its scope. Thanks to Matt Hacker and all of the Atlas
team for answering all my technical questions regarding formatting, asciidoc, and
LaTeX, and thanks to Rachel Monaghan, Nick Adams, and all of the production team
for their final review and their hundreds of corrections.
Last but not least, I am infinitely grateful to my beloved wife, Emmanuelle, and to our
three wonderful kids, Alexandre, Rémi, and Gabrielle, for encouraging me to work
hard on this book, asking many questions (who said you can’t teach neural networks
to a seven-year-old?), and even bringing me cookies and coffee. What more can one
dream of?
PART I
The Fundamentals of
Machine Learning
CHAPTER 1
The Machine Learning Landscape
When most people hear “Machine Learning,” they picture a robot: a dependable but‐
ler or a deadly Terminator depending on who you ask. But Machine Learning is not
just a futuristic fantasy, it’s already here. In fact, it has been around for decades in
some specialized applications, such as Optical Character Recognition (OCR). But the
first ML application that really became mainstream, improving the lives of hundreds
of millions of people, took over the world back in the 1990s: it was the spam filter.
Not exactly a self-aware Skynet, but it does technically qualify as Machine Learning
(it has actually learned so well that you seldom need to flag an email as spam any‐
more). It was followed by hundreds of ML applications that now quietly power hun‐
dreds of products and features that you use regularly, from better recommendations
to voice search.
Where does Machine Learning start and where does it end? What exactly does it
mean for a machine to learn something? If I download a copy of Wikipedia, has my
computer really “learned” something? Is it suddenly smarter? In this chapter we will
start by clarifying what Machine Learning is and why you may want to use it.
Then, before we set out to explore the Machine Learning continent, we will take a
look at the map and learn about the main regions and the most notable landmarks:
supervised versus unsupervised learning, online versus batch learning, instance-
based versus model-based learning. Then we will look at the workflow of a typical ML
project, discuss the main challenges you may face, and cover how to evaluate and
fine-tune a Machine Learning system.
This chapter introduces a lot of fundamental concepts (and jargon) that every data
scientist should know by heart. It will be a high-level overview (the only chapter
without much code), all rather simple, but you should make sure everything is
crystal-clear to you before continuing to the rest of the book. So grab a coffee and let’s
get started!
If you already know all the Machine Learning basics, you may want
to skip directly to Chapter 2. If you are not sure, try to answer all
the questions listed at the end of the chapter before moving on.
For example, your spam filter is a Machine Learning program that can learn to flag
spam given examples of spam emails (e.g., flagged by users) and examples of regular
(nonspam, also called “ham”) emails. The examples that the system uses to learn are
called the training set. Each training example is called a training instance (or sample).
In this case, the task T is to flag spam for new emails, the experience E is the training
data, and the performance measure P needs to be defined; for example, you can use
the ratio of correctly classified emails. This particular performance measure is called
accuracy and it is often used in classification tasks.
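The following is not from the book, but a minimal sketch of what that performance measure looks like in code: accuracy is simply the ratio of correctly classified instances, computed here with Scikit-Learn on a handful of made-up spam/ham labels.

import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical ground-truth labels (1 = spam, 0 = ham) and model predictions
y_true = np.array([1, 0, 0, 1, 1, 0, 0, 0])
y_pred = np.array([1, 0, 1, 1, 0, 0, 0, 0])

# Accuracy is the ratio of correctly classified emails
print(accuracy_score(y_true, y_pred))  # 0.75 (6 out of 8 correct)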
If you just download a copy of Wikipedia, your computer has a lot more data, but it is
not suddenly better at any task. Thus, it is not Machine Learning.
1. First you would look at what spam typically looks like. You might notice that
some words or phrases (such as “4U,” “credit card,” “free,” and “amazing”) tend to
come up a lot in the subject. Perhaps you would also notice a few other patterns
in the sender’s name, the email’s body, and so on.
Since the problem is not trivial, your program will likely become a long list of com‐
plex rules—pretty hard to maintain.
In contrast, a spam filter based on Machine Learning techniques automatically learns
which words and phrases are good predictors of spam by detecting unusually fre‐
quent patterns of words in the spam examples compared to the ham examples
(Figure 1-2). The program is much shorter, easier to maintain, and most likely more
accurate.
Another area where Machine Learning shines is for problems that either are too com‐
plex for traditional approaches or have no known algorithm. For example, consider
speech recognition: say you want to start simple and write a program capable of dis‐
tinguishing the words “one” and “two.” You might notice that the word “two” starts
with a high-pitch sound (“T”), so you could hardcode an algorithm that measures
high-pitch sound intensity and use that to distinguish ones and twos. Obviously this
technique will not scale to thousands of words spoken by millions of very different
people in noisy environments and in dozens of languages. The best solution (at least
today) is to write an algorithm that learns by itself, given many example recordings
for each word.
Finally, Machine Learning can help humans learn (Figure 1-4): ML algorithms can be
inspected to see what they have learned (although for some algorithms this can be
tricky). For instance, once the spam filter has been trained on enough spam, it can
easily be inspected to reveal the list of words and combinations of words that it
believes are the best predictors of spam. Sometimes this will reveal unsuspected cor‐
relations or new trends, and thereby lead to a better understanding of the problem.
Applying ML techniques to dig into large amounts of data can help discover patterns
that were not immediately apparent. This is called data mining.
• Problems for which existing solutions require a lot of hand-tuning or long lists of
rules: one Machine Learning algorithm can often simplify code and perform bet‐
ter.
• Complex problems for which there is no good solution at all using a traditional
approach: the best Machine Learning techniques can find a solution.
• Fluctuating environments: a Machine Learning system can adapt to new data.
• Getting insights about complex problems and large amounts of data.
• Whether or not they are trained with human supervision (supervised, unsuper‐
vised, semisupervised, and Reinforcement Learning)
• Whether or not they can learn incrementally on the fly (online versus batch
learning)
• Whether they work by simply comparing new data points to known data points,
or instead detect patterns in the training data and build a predictive model, much
like scientists do (instance-based versus model-based learning)
These criteria are not exclusive; you can combine them in any way you like. For example, a state-of-the-art spam filter may learn on the fly using a deep neural network, making it an online, model-based, supervised learning system.
Supervised/Unsupervised Learning
Machine Learning systems can be classified according to the amount and type of
supervision they get during training. There are four major categories: supervised
learning, unsupervised learning, semisupervised learning, and Reinforcement Learn‐
ing.
Supervised learning
In supervised learning, the training data you feed to the algorithm includes the desired
solutions, called labels (Figure 1-5).
Figure 1-5. A labeled training set for supervised learning (e.g., spam classification)
A typical supervised learning task is classification. The spam filter is a good example
of this: it is trained with many example emails along with their class (spam or ham),
and it must learn how to classify new emails.
Another typical task is to predict a target numeric value, such as the price of a car,
given a set of features (mileage, age, brand, etc.) called predictors. This sort of task is
called regression (Figure 1-6).1 To train the system, you need to give it many examples
of cars, including both their predictors and their labels (i.e., their prices).
1 Fun fact: this odd-sounding name is a statistics term introduced by Francis Galton while he was studying the
fact that the children of tall people tend to be shorter than their parents. Since children were shorter, he called
this regression to the mean. This name was then applied to the methods he used to analyze correlations
between variables.
Note that some regression algorithms can be used for classification as well, and vice
versa. For example, Logistic Regression is commonly used for classification, as it can
output a value that corresponds to the probability of belonging to a given class (e.g.,
20% chance of being spam).
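As a hedged illustration of that last point (not taken from the book), the sketch below fits Scikit-Learn's LogisticRegression on a tiny made-up dataset and uses predict_proba to obtain class probabilities; the feature values and the printed numbers are assumptions for illustration only.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny made-up dataset: one feature (e.g., a spam score) and binary labels
X = np.array([[0.1], [0.4], [0.5], [0.9], [1.2], [1.5]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)

# predict_proba returns class probabilities, e.g. "x% chance of being spam"
print(clf.predict_proba([[0.6]]))  # exact values depend on the fit
print(clf.predict([[0.6]]))        # hard class decision derived from the probability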
Here are some of the most important supervised learning algorithms (covered in this
book):
• k-Nearest Neighbors
• Linear Regression
• Logistic Regression
• Support Vector Machines (SVMs)
• Decision Trees and Random Forests
• Neural networks2
2 Some neural network architectures can be unsupervised, such as autoencoders and restricted Boltzmann
machines. They can also be semisupervised, such as in deep belief networks and unsupervised pretraining.
Here are some of the most important unsupervised learning algorithms (we will
cover dimensionality reduction in Chapter 8):
• Clustering
— k-Means
— Hierarchical Cluster Analysis (HCA)
— Expectation Maximization
• Visualization and dimensionality reduction
— Principal Component Analysis (PCA)
— Kernel PCA
— Locally-Linear Embedding (LLE)
— t-distributed Stochastic Neighbor Embedding (t-SNE)
• Association rule learning
— Apriori
— Eclat
For example, say you have a lot of data about your blog’s visitors. You may want to
run a clustering algorithm to try to detect groups of similar visitors (Figure 1-8). At
no point do you tell the algorithm which group a visitor belongs to: it finds those
connections without your help. For example, it might notice that 40% of your visitors
are males who love comic books and generally read your blog in the evening, while
20% are young sci-fi lovers who visit during the weekends, and so on. If you use a
hierarchical clustering algorithm, it may also subdivide each group into smaller
groups. This may help you target your posts for each group.
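A minimal sketch of that idea (not from the book) is shown below: k-Means groups made-up visitor feature vectors into clusters without ever being told which group anyone belongs to. The features and values are invented for illustration.

import numpy as np
from sklearn.cluster import KMeans

# Made-up visitor features: [age, visits per week, evening-visit ratio]
visitors = np.array([
    [34, 5, 0.90],
    [36, 4, 0.80],
    [19, 2, 0.10],
    [22, 3, 0.20],
    [45, 6, 0.85],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(visitors)
print(labels)  # cluster index assigned to each visitor, found without any labels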
3 Notice how animals are rather well separated from vehicles, how horses are close to deer but far from birds,
and so on. Figure reproduced with permission from Socher, Ganjoo, Manning, and Ng (2013), “T-SNE visual‐
ization of the semantic word space.”
Finally, another common unsupervised task is association rule learning, in which the
goal is to dig into large amounts of data and discover interesting relations between
attributes. For example, suppose you own a supermarket. Running an association rule
on your sales logs may reveal that people who purchase barbecue sauce and potato
chips also tend to buy steak. Thus, you may want to place these items close to each
other.
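To make the supermarket example concrete, here is a small sketch (not from the book, and not a full association-rule algorithm such as Apriori): it simply counts how often pairs of items appear together in made-up sales logs, which is the raw signal such algorithms build on.

from collections import Counter
from itertools import combinations

# Made-up sales logs: each entry is the set of items in one basket
baskets = [
    {"barbecue sauce", "potato chips", "steak"},
    {"barbecue sauce", "potato chips", "steak", "beer"},
    {"potato chips", "soda"},
    {"barbecue sauce", "potato chips", "steak"},
    {"bread", "milk"},
]

# Count how often each pair of items is bought together
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

print(pair_counts.most_common(3))  # frequent pairs hint at rules like {sauce, chips} -> steak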
Reinforcement Learning
Reinforcement Learning is a very different beast. The learning system, called an agent
in this context, can observe the environment, select and perform actions, and get
rewards in return (or penalties in the form of negative rewards, as in Figure 1-12). It
must then learn by itself what is the best strategy, called a policy, to get the most
reward over time. A policy defines what action the agent should choose when it is in a
given situation.
4 That’s when the system works perfectly. In practice it often creates a few clusters per person, and sometimes
mixes up two people who look alike, so you need to provide a few labels per person and manually clean up
some clusters.
Batch learning
In batch learning, the system is incapable of learning incrementally: it must be trained
using all the available data. This will generally take a lot of time and computing
resources, so it is typically done offline. First the system is trained, and then it is
launched into production and runs without learning anymore; it just applies what it
has learned. This is called offline learning.
If you want a batch learning system to know about new data (such as a new type of
spam), you need to train a new version of the system from scratch on the full dataset
(not just the new data, but also the old data), then stop the old system and replace it
with the new one.
Fortunately, the whole process of training, evaluating, and launching a Machine
Learning system can be automated fairly easily (as shown in Figure 1-3), so even a
Online learning
In online learning, you train the system incrementally by feeding it data instances
sequentially, either individually or by small groups called mini-batches. Each learning
step is fast and cheap, so the system can learn about new data on the fly, as it arrives
(see Figure 1-13).
Online learning is great for systems that receive data as a continuous flow (e.g., stock
prices) and need to adapt to change rapidly or autonomously. It is also a good option
This whole process is usually done offline (i.e., not on the live sys‐
tem), so online learning can be a confusing name. Think of it as
incremental learning.
One important parameter of online learning systems is how fast they should adapt to
changing data: this is called the learning rate. If you set a high learning rate, then your
system will rapidly adapt to new data, but it will also tend to quickly forget the old
data (you don’t want a spam filter to flag only the latest kinds of spam it was shown).
Conversely, if you set a low learning rate, the system will have more inertia; that is, it
will learn more slowly, but it will also be less sensitive to noise in the new data or to
sequences of nonrepresentative data points.
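The sketch below (an illustration, not the book's code) shows one way to do incremental learning in Scikit-Learn: SGDRegressor's partial_fit performs one fast, cheap learning step per mini-batch, and its eta0 parameter plays the role of the learning rate discussed above. The streaming data is simulated.

import numpy as np
from sklearn.linear_model import SGDRegressor

# A model that can learn incrementally; eta0 plays the role of the learning rate
model = SGDRegressor(learning_rate="constant", eta0=0.01, random_state=42)

rng = np.random.RandomState(42)
for step in range(100):                      # pretend data arrives as a stream
    X_batch = rng.rand(10, 1)                # mini-batch of 10 instances, 1 feature
    y_batch = 3 * X_batch.ravel() + rng.randn(10) * 0.1
    model.partial_fit(X_batch, y_batch)      # one fast, cheap learning step

print(model.coef_, model.intercept_)         # slope should end up close to 3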
A big challenge with online learning is that if bad data is fed to the system, the sys‐
tem’s performance will gradually decline. If we are talking about a live system, your
clients will notice. For example, bad data could come from a malfunctioning sensor
on a robot, or from someone spamming a search engine to try to rank high in search
Instance-based learning
Possibly the most trivial form of learning is simply to learn by heart. If you were to
create a spam filter this way, it would just flag all emails that are identical to emails
that have already been flagged by users—not the worst solution, but certainly not the
best.
Instead of just flagging emails that are identical to known spam emails, your spam
filter could be programmed to also flag emails that are very similar to known spam
emails. This requires a measure of similarity between two emails. A (very basic) simi‐
larity measure between two emails could be to count the number of words they have
in common. The system would flag an email as spam if it has many words in com‐
mon with a known spam email.
This is called instance-based learning: the system learns the examples by heart, then
generalizes to new cases using a similarity measure (Figure 1-15).
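Here is a minimal sketch (not from the book) of the very basic similarity measure just described: counting the words two emails have in common, and flagging the new email if the overlap with any known spam is large. The emails and the threshold are made up.

def similarity(email_a, email_b):
    """A (very basic) similarity measure: number of words two emails share."""
    return len(set(email_a.lower().split()) & set(email_b.lower().split()))

known_spam = ["win a free credit card now", "free prize click now"]
new_email = "claim your free prize now"

# Flag the new email if it shares many words with any known spam email
scores = [similarity(new_email, spam) for spam in known_spam]
print(scores, "-> spam" if max(scores) >= 3 else "-> ham")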
For example, suppose you want to know if money makes people happy, so you down‐
load the Better Life Index data from the OECD’s website as well as stats about GDP
per capita from the IMF’s website. Then you join the tables and sort by GDP per cap‐
ita. Table 1-1 shows an excerpt of what you get.
Let’s plot the data for a few random countries (Figure 1-17).
There does seem to be a trend here! Although the data is noisy (i.e., partly random), it
looks like life satisfaction goes up more or less linearly as the country’s GDP per cap‐
ita increases. So you decide to model life satisfaction as a linear function of GDP per
capita. This step is called model selection: you selected a linear model of life satisfac‐
tion with just one attribute, GDP per capita (Equation 1-1).
This model has two model parameters, θ0 and θ1.5 By tweaking these parameters, you
can make your model represent any linear function, as shown in Figure 1-18.
5 By convention, the Greek letter θ (theta) is frequently used to represent model parameters.
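Equation 1-1 is not reproduced in this extract; based on the description above (a linear model of life satisfaction with GDP per capita as its single attribute), it has the form

    life_satisfaction = θ0 + θ1 × GDP_per_capita

where θ0 is the intercept (the height of the line) and θ1 is the slope.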
Figure 1-19. The linear model that fits the training data best
You are finally ready to run the model to make predictions. For example, say you
want to know how happy Cypriots are, and the OECD data does not have the answer.
Fortunately, you can use your model to make a good prediction: you look up Cyprus’s
GDP per capita, find $22,587, and then apply your model and find that life satisfac‐
tion is likely to be somewhere around 4.85 + 22,587 × 4.91 × 10-5 = 5.96.
To whet your appetite, Example 1-1 shows the Python code that loads the data, pre‐
pares it,6 creates a scatterplot for visualization, and then trains a linear model and
makes a prediction.7
6 The code assumes that prepare_country_stats() is already defined: it merges the GDP and life satisfaction
data into a single Pandas dataframe.
7 It’s okay if you don’t understand all the code yet; we will present Scikit-Learn in the following chapters.
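The listing for Example 1-1 is not reproduced in this extract. A minimal sketch of what it describes might look like the following; the CSV file names and column names are assumptions, and prepare_country_stats() is the helper mentioned in footnote 6, assumed to be defined elsewhere.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sklearn.linear_model

# Load the data (file names are assumed for illustration)
oecd_bli = pd.read_csv("oecd_bli_2015.csv", thousands=",")
gdp_per_capita = pd.read_csv("gdp_per_capita.csv", thousands=",")

# Prepare the data: prepare_country_stats() merges both tables into one
# DataFrame (see footnote 6); it is assumed to be defined elsewhere
country_stats = prepare_country_stats(oecd_bli, gdp_per_capita)
X = np.c_[country_stats["GDP per capita"]]
y = np.c_[country_stats["Life satisfaction"]]

# Visualize the data
country_stats.plot(kind="scatter", x="GDP per capita", y="Life satisfaction")
plt.show()

# Select a linear model and train it
model = sklearn.linear_model.LinearRegression()
model.fit(X, y)

# Make a prediction for Cyprus
X_new = [[22587]]  # Cyprus's GDP per capita
print(model.predict(X_new))  # roughly [[5.96]], matching the text above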
This is what a typical Machine Learning project looks like. In Chapter 2 you will
experience this first-hand by going through an end-to-end project.
We have covered a lot of ground so far: you now know what Machine Learning is
really about, why it is useful, what some of the most common categories of ML sys‐
tems are, and what a typical project workflow looks like. Now let’s look at what can go
wrong in learning and prevent you from making accurate predictions.
As the authors put it: “these results suggest that we may want to reconsider the trade-
off between spending time and money on algorithm development versus spending it
on corpus development.”
The idea that data matters more than algorithms for complex problems was further
popularized by Peter Norvig et al. in a paper titled “The Unreasonable Effectiveness
of Data” published in 2009.10 It should be noted, however, that small- and medium-
sized datasets are still very common, and it is not always easy or cheap to get extra
training data, so don’t abandon algorithms just yet.
8 For example, knowing whether to write “to,” “two,” or “too” depending on the context.
9 Figure reproduced with permission from Banko and Brill (2001), “Learning Curves for Confusion Set Disam‐
biguation.”
10 “The Unreasonable Effectiveness of Data,” Peter Norvig et al. (2009).
If you train a linear model on this data, you get the solid line, while the old model is
represented by the dotted line. As you can see, not only does adding a few missing
countries significantly alter the model, but it makes it clear that such a simple linear
model is probably never going to work well. It seems that very rich countries are not
happier than moderately rich countries (in fact they seem unhappier), and conversely
some poor countries seem happier than many rich countries.
By using a nonrepresentative training set, we trained a model that is unlikely to make
accurate predictions, especially for very poor and very rich countries.
It is crucial to use a training set that is representative of the cases you want to general‐
ize to. This is often harder than it sounds: if the sample is too small, you will have
sampling noise (i.e., nonrepresentative data as a result of chance), but even very large
samples can be nonrepresentative if the sampling method is flawed. This is called
sampling bias.
• First, to obtain the addresses to send the polls to, the Literary Digest used tele‐
phone directories, lists of magazine subscribers, club membership lists, and the
like. All of these lists tend to favor wealthier people, who are more likely to vote
Republican (hence Landon).
• Second, less than 25% of the people who received the poll answered. Again, this
introduces a sampling bias, by ruling out people who don’t care much about poli‐
tics, people who don’t like the Literary Digest, and other key groups. This is a spe‐
cial type of sampling bias called nonresponse bias.
Here is another example: say you want to build a system to recognize funk music vid‐
eos. One way to build your training set is to search “funk music” on YouTube and use
the resulting videos. But this assumes that YouTube’s search engine returns a set of
videos that are representative of all the funk music videos on YouTube. In reality, the
search results are likely to be biased toward popular artists (and if you live in Brazil
you will get a lot of “funk carioca” videos, which sound nothing like James Brown).
On the other hand, how else can you get a large training set?
Poor-Quality Data
Obviously, if your training data is full of errors, outliers, and noise (e.g., due to poor-
quality measurements), it will make it harder for the system to detect the underlying
patterns, so your system is less likely to perform well. It is often well worth the effort
to spend time cleaning up your training data. The truth is, most data scientists spend
a significant part of their time doing just that. For example:
• If some instances are clearly outliers, it may help to simply discard them or try to
fix the errors manually.
• If some instances are missing a few features (e.g., 5% of your customers did not specify their age), you must decide whether you want to ignore this attribute altogether, ignore these instances, fill in the missing values (e.g., with the median age), or train one model with the feature and one model without it, and so on (a minimal sketch of the median option follows this list).
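The sketch below is not from the book; it shows the median-fill option on a made-up customer table using pandas. The column names and values are assumptions for illustration.

import pandas as pd

# Made-up customer data with some missing ages
customers = pd.DataFrame({
    "age": [25, 32, None, 41, None, 29],
    "spend": [120, 340, 90, 410, 150, 200],
})

# Fill in the missing values with the median age
median_age = customers["age"].median()
customers["age"] = customers["age"].fillna(median_age)
print(customers)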
Irrelevant Features
As the saying goes: garbage in, garbage out. Your system will only be capable of learn‐
ing if the training data contains enough relevant features and not too many irrelevant
ones. A critical part of the success of a Machine Learning project is coming up with a
good set of features to train on. This process, called feature engineering, involves:
Now that we have looked at many examples of bad data, let’s look at a couple of exam‐
ples of bad algorithms.
Complex models such as deep neural networks can detect subtle patterns in the data,
but if the training set is noisy, or if it is too small (which introduces sampling noise),
then the model is likely to detect patterns in the noise itself. Obviously these patterns
will not generalize to new instances. For example, say you feed your life satisfaction
model many more attributes, including uninformative ones such as the country’s
name. In that case, a complex model may detect patterns like the fact that all countries in the training data with a w in their name have a life satisfaction greater than 7: New Zealand (7.3), Norway (7.4), Sweden (7.2), and Switzerland (7.5). How confident can you be that this w-satisfaction rule will generalize to Rwanda or Zimbabwe?
Constraining a model to make it simpler and reduce the risk of overfitting is called
regularization. For example, the linear model we defined earlier has two parameters,
θ0 and θ1. This gives the learning algorithm two degrees of freedom to adapt the model
to the training data: it can tweak both the height (θ0) and the slope (θ1) of the line. If
we forced θ1 = 0, the algorithm would have only one degree of freedom and would
have a much harder time fitting the data properly: all it could do is move the line up
or down to get as close as possible to the training instances, so it would end up
around the mean. A very simple model indeed! If we allow the algorithm to modify θ1
but we force it to keep it small, then the learning algorithm will effectively have some‐
where in between one and two degrees of freedom. It will produce a simpler model
than with two degrees of freedom, but more complex than with just one. You want to
find the right balance between fitting the data perfectly and keeping the model simple
enough to ensure that it will generalize well.
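The book introduces regularization here in general terms; as one concrete (illustrative, not the book's own) example, Scikit-Learn's Ridge regression applies exactly this kind of constraint by penalizing large slopes. The GDP and life-satisfaction values below are made up.

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Made-up GDP-per-capita (x) and life-satisfaction (y) values for a few countries
X = np.array([[9000], [20000], [37000], [50000], [55000]])
y = np.array([4.9, 5.8, 7.2, 7.3, 6.9])

plain = LinearRegression().fit(X, y)
regularized = Ridge(alpha=1e9).fit(X, y)   # a large alpha forces a much smaller slope

print(plain.coef_, regularized.coef_)      # the regularized slope is pushed toward 0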
Figure 1-23 shows three models: the dotted line represents the original model that
was trained with a few countries missing, the dashed line is our second model trained
with all countries, and the solid line is a linear model trained with the same data as
the first model but with a regularization constraint. You can see that regularization
forced the model to have a smaller slope, which fits a bit less the training data that the
model was trained on, but actually allows it to generalize better to new examples.
Stepping Back
By now you already know a lot about Machine Learning. However, we went through
so many concepts that you may be feeling a little lost, so let’s step back and look at the
big picture:
While this strange scene was being witnessed, Colonel Loch and
Captain Speedy were manœuvring at the extremity of Selasse, on
the road which encircled the fortress and thence led to Magdala.
Looking up to the heights the British officers saw a number of men
careering about on the plateau which connected Selasse with
Magdala. It was ascertained that they belonged to the enemy, and
their dress indicated that they were chiefs. When these men saw the
cavalry advancing round the corner at Selasse they retired slowly
and in good order to Magdala, firing as they went.
A few shells were now sent whizzing amongst the Abyssinians, who
had by this time commenced a desultory firing. Very soon, growing
alarmed at the work of our artillery, the Abyssinians retired for
shelter behind some wooden booths. A few more shells, however,
soon dislodged Theodore and his men from their hiding places, and
they beat a rapid retreat towards Magdala. Still they had not
finished, and continued to fire at all who came within reach of their
mountain stronghold. Their persistent firing ultimately lured a
detachment of the 33rd Foot into action, but without marked effect,
and shortly after this orders came from Sir Charles Staveley to cease
firing. At the same time the British flag was hoisted above Selasse
and Fahla. Only Magdala now remained.
Then Napier distributed his force in preparation for the attack. Soon
twenty guns were thundering at the gates. Theodore could not
misunderstand the meaning of the British now. It was surrender or
death for him and his followers.
The bombardment lasted two hours. At the end of this period Napier
had made up his mind that the defenders were weak, and that the
British troops would suffer very little loss in the assault. He therefore
ordered the Royal Engineers, the 33rd, the 45th, and the King’s Own
to be prepared to carry on the attack. Already the fire from the
fortress had ceased. Soon signals for rapid firing were given to the
British artillery, and under the furious cannonade which proceeded,
the British troops began their march along the plateau.
Upon their arrival within fifty yards of the foot of Magdala, the order
was given to the artillery to cease fire. Then the Engineers at once
brought their sniders into play, and for ten minutes they and the
33rd and 45th rained a storm of leaden pellets upon the defenders.
Theodore and his brave followers had been concealed while the
artillery was at work. Now, however, the king showed himself. Up he
sprang, singing out his war-cry, and with his bodyguard he hastened
to the gates, prepared to give the invaders a fitting welcome. He
posted his men at the loopholes and along the wall, topped with
wattled hurdles. Soon his signal was given, and heavy firing was
directed upon the advancing soldiers, several of whom were
wounded. Next the British fire was concentrated on the barbican,
and the revetment, through the loopholes of which rays of smoke
issuing forth betrayed the presence of the enemy. Slowly the soldiers
advanced through the rain which accompanied the thunderstorm
which now raged. For a minute there was a pause, and then again a
dozen bullets hurtled through the advance guard of the troops,
wounding Major Pritchard and several of the Engineers. Then Major
Pritchard and Lieutenant Morgan made a dash upon the barbican.
They found the gate closed, and the inside of the square completely
blocked up with huge stones.
The soldiers were in the presence of the Emperor, and he was dying.
Soon the rest of the troops followed their leaders, and the British
flag was streaming from the post which crowned the summit of the
Abyssinian stronghold. Then, while the sound of “God Save the
Queen” rent the once more peaceful air, and the soldiers of the
Queen joined lustily in the triumphant cheers, the once proud
Emperor of Abyssinia, in all the gorgeous trappings of his state, and
surrounded by a crowd of interested spectators, breathed his last in
the stronghold where he had thought to give pause to those he
regarded as the enemies of his kingdom.
Soon after “the Advance” was once more sounded, and the soldiers
filed in column through the narrow streets, the commander-in-chief
and staff following.
When the cost of the assault came to be reckoned, it was found that
17 British had been wounded, though none of them mortally. The
Abyssinian dead were estimated at 60, with double that number of
wounded.
On the 18th April, 1868, the troops turned their faces northward for
their homeward march, their object fully attained.
CHAPTER LVII.
THE BATTLES OF AMOAFUL AND
ORDASHU.
1874.
With the march of time, Britain extended and strengthened her hold
upon the settlement, and ultimately, pursuing this policy, brought
out the Danes, and made exchanges with the Dutch there. These
proceedings culminated in Britain becoming possessors of the whole
of the territory formerly under Dutch protection. The taking over of
the Dutch forts caused heart-burning among the Ashantees.
Particularly was this the case with regard to Elimina, where, at the
time the negotiations for the transfer were being considered, a
number of Ashantee troops were lying.
King Koffee Kalkali, the ruler of the Ashantees, protested against the
transfer, maintaining that the Dutch had no right to hand over the
territory to Britain, as it belonged to him. Notwithstanding, the
Dutch contrived to get rid of the truculent Koffee and his followers
then stationed at Elimina.
Not only did the Ashantees resent the Anglo-Dutch agreement, but
other tribes in several instances also took objection. This especially
was the case as regarded the Fanties and Eliminas, who hated each
other, and interchanged hostile acts, although by this time both were
under one common protection.
The old hatred of Britain had been awakened. King Koffee assumed
a dominant and aggressive spirit, and became bent on invasion. To
some extent he was abetted by the Eliminas, who, in part at any
rate, were disloyal to the whites. From these causes arose the
campaign of ’73-’74 and the battles of Amoaful and Ordashu.
At home, the Government was slow to act, and not until repeated
application had been made for white troops was the appeal given
heed to.
His instructions, briefly, were to drive the Ashantees back over the
Prah, then to follow and punish them until they should consent to be
peaceful, should release their prisoners, and comply with terms
necessary to our own interests and those of humanity.
The deadly nature of the coast, “the white man’s grave,” was
doubtless a potent factor with the Government in that they did not
immediately acquiesce with Sir Garnet’s request for white troops.
But, as we know, the Government at last acceded, and the
regiments selected for service in that disease-pregnated country
have added lustre to their fame and also another page of glorious
history to the story of the pluck and endurance of Britain’s soldiers.
The total force under the command of Sir Garnet Wolseley was made up of Colonel Wood’s native regiment of 400 men, Major Russell’s native regiment of 400, the 42nd Highlanders (Black Watch) 575 strong, the Rifle Brigade 650, 75 men of the 23rd Fusiliers, the Royal Naval Brigade 225, the 2nd West India Regiment 350, the Royal Engineers 40, and Rait’s artillery 50.
About the end of October, 1873, Sir Garnet Wolseley began his
forward march into the interior. There was fighting to be done ere
long, for the enemy made an attempt to arrest the progress of the
troops by besieging Abrakrampa, the chief town of the province of
Abra, of which the native king was Britain’s staunch ally. A three
days’ ineffectual leaguer ensued, during which the Ashantees lost
heavily, while not so much as one white man was injured. With Sir
Garnet close behind, the Ashantees thought it best to recross the
Prah and retreat towards Coomassie.
Through the dense bush the troops marched in the garish and
dazzling sunlight, and at the end of their daily tramp through the
hostile country they were glad to lie down and rest in the huts
provided for them. In the way of rations the men were well looked
after by the commissariat department, the fare being as follows:—
One and a half pounds of meat, salt or fresh, one pound of pressed
meat, one and a quarter pounds of biscuits, four ounces of pressed
vegetables, two ounces of rice or preserved peas, three ounces of
sugar, three-quarters of an ounce of tea, half an ounce of salt, one-
thirteenth of an ounce of pepper. With such substantial and varied
feeding the hardships of the march were minimised and weakness
was rare—another striking illustration of the truth of the maxim of
the great Napoleon that “an army goes upon its belly.”
The further the British force progressed, the denser and loftier grew the
forest, although the Engineers with unflagging energy had cleared a
pathway as far as the Prah. On the 15th December, 1873, Sir Garnet
Wolseley was able to report “the first phase of the war had been
brought to a satisfactory conclusion by a few companies of the 2nd
West India regiment, Rait’s artillery, Gordon’s Houssas, and Wood’s
and Russell’s regiments, admirably conducted by the British officers
belonging to them, without the assistance of any other troops except
the marines and blue-jackets who were upon the station on his
arrival.”
Sir Garnet arrived at Prashu on the 2nd January, 1874, and was
joyfully received by the assembled soldiers. Early in the same
morning an Ashantee embassy was espied on the other side of the
Prah. These ambassadors brought a letter from the truculent King
Koffee, in which the wily savage had the audacity to point out that
the attack upon him was unjustifiable.
For this final move Sir Garnet was prepared. In his notes for the use
of his army the commander says:—
With these and similar heartening instructions, the coming fight was
anticipated eagerly by our troops, the Fanties alone, who were
employed as transport bearers, proving unreliable. These latter
deserted in thousands, thus throwing extra work upon the white
troops, many of the regiments having to carry their own baggage.
The troops were early on the move, and with precision they filed into
their allotted places. Led by Brigadier Sir Archibald Alison, the front
column was comprised of the famous Black Watch, eighty men of
the 23rd Fusiliers, Rait’s artillery, two small rifled guns manned by
Houssas, and two rocket troughs, with a detachment of the Royal
Engineers. The left column was under the command of Brigadier
McLeod, of the Black Watch, and contained half of the blue-jackets,
Russell’s native troops, two rocket troughs, and Royal Engineers.
Lieutenant-Colonel Wood, V.C., of the Perthshire Light Infantry, had
charge of the right column, which consisted of the remaining half of
the naval brigade, seamen and marines, detachments of the Royal
Engineers, and artillery, with rockets and a regiment of African
levies. The rear column was made up of the second battalion of the
Rifle Brigade, 580 strong, and the entire force was under the skilful
command of Sir Garnet Wolseley.
Under a heavy fire, the left column were struggling to oust the
enemy. There, while urging on his men, the gallant Captain Buckle,
R.E., was mortally wounded, having been hit by two slugs in the
region of the heart.
The right column were also soon hotly engaged, and so dense was
the jungle between it and the main road that the men, in firing, had
the greatest difficulty to avoid hitting their comrades of the Black
Watch.
Mr. Henty, regarding this, says:—“Anxious to see the nature of the
difficulties with which the troops were contending, I went out to the
right column, and found the naval brigade lying down and firing into
a dense bush, from which, in spite of their heavy firing, answering
discharges came incessantly, at a distance of some twenty yards or
so. The air above was literally alive with slugs, and a perfect shower
of leaves continued to fall upon the earth. The sailors complained
that either the 23rd or 42nd were firing at them, and the same
complaint was made against the naval brigade by the 42nd and
23rd. No doubt there was, at times, justice in these complaints, for
the bush was so bewilderingly dense that men soon lost all idea of
the points of the compass, and fired in any direction from which
shots came.”
This was about one o’clock in the afternoon, and the Rifles
succeeded in repulsing the natives. It will thus be seen that on all
sides of the square the Ashantees had tried to break through. For
more than an hour they maintained the attack, but the resistance
offered completely set their attempts at nought. The climax came
when Sir Garnet, observing that the Ashantee fire was slackening,
gave orders for the line to advance, and to wheel round, so as to
drive the enemy northwards before it.
The movement was splendidly carried out. The wild Kosses and
Bonnymen of Wood’s regiment, cannibals, who had fought steadily
and silently so long as they had been on the defensive, now raised
their shrill war-cry, slung their rifles, drew their cutlasses, and like so
many wild beasts, dashed into the bush to close with the enemy,
while the Rifles, quietly and in an orderly manner as if upon parade,
went on in extended order, scouring every bush with their bullets,
and in five minutes from the time the “Advance” sounded, the
Ashantees were in full and final retreat. Even then the enemy were
not inclined to take their beating without protest, and for several
hours continued to harass the troops by sudden but abortive rushes.
The British loss was over 200 officers and men killed and wounded,
the Black Watch suffering most heavily, having one officer killed, and
7 officers and 104 men wounded. In his despatch Sir Garnet said:—
Hard fighting, however, was not yet at an end, and on the day
following the rout at Amoaful, February 1st, the Ashantees made a
stand at Becquah, an important town standing a short distance from
the line of communication, and which would undoubtedly have been
the cause of considerable trouble and loss of life had the General
moved directly north without causing the place to be destroyed.
Only about a mile separated the camp from Becquah, and the force
creeping silently upon the village, soon engaged with the enemy.
Sharp firing took place, and the natives, unable to withstand the
assault, turned tail and fled. The men of the naval brigade were the
first to enter the place, and soon the huts were a mass of flames.
Some native accoutrements and much corn fell into our hands.
Following this, several villages which lay between Amoaful and
Coomassie were taken with comparatively little fighting, the
Ashantees having evidently taken much to heart the severe loss
inflicted on them on 31st January. Each village passed through had
its human sacrifice lying in the middle of the path, for the purpose of
affrighting the conquerors.
The spectacle was sickening, and the wanton cruelty made the
victorious troops even more determined and anxious to put an end
to these frightful barbarities.
The corpses lay thick on the roadside, while the bush was littered
with dead and dying. Sir Garnet rushed the whole of the army
through Ordashu, and then, without loss of time, “the Forty-Twa”
were again in the van, heading towards Coomassie, a sufficient force
having been left to guard Ordashu.
The kingdom of Zululand in 1873 lay, as all are aware, between the
British colony of Natal on the south and the Transvaal Republic on
the north. Now, while the Natal border had always been in a state of
quiet and peacefulness, and the nearer settlers were on friendly
terms with their Zulu neighbours, the northern border of the
kingdom was in a constant state of unrest. For one thing, the
Transvaal Boers were, upon one pretext and another, constantly
encroaching in a southerly direction on the confines of Zululand; for
another, they were in the habit of treating the Zulus and other tribes
with an unpardonable severity.
As late as 1876 the Zulu people begged that the Governor of Natal
“will take a strip of the country, the length and breadth of which is to
be agreed upon between the Zulus and the Commissioners (for
whom they ask) sent from Natal, the strip to abut on the colony of
Natal and to run to the northward and eastward in such a manner as
to interpose all its length between the Boers and the Zulus, and to
be governed by the colony of Natal.”
The reasons given for the issue of the ultimatum were three in
particular. The first had reference to the affair of Sihayo. On July 28,
1878, a wife of the chief Sihayo, an under-chief of Cetewayo’s, had
left her husband and escaped into Natal. Hither she was followed by
Sihayo’s two chief sons and brother, conveyed back to Zululand, and
there put to death in accordance with the native custom for such an
offence. These culprits the Natal Government now demanded should
be given up to be tried in the Natal courts. Cetewayo, however, did
not regard the offence as a serious one, and offered money
compensation in place of the surrender of the young men, “looking
upon the whole affair as the act of rash boys, who, in their zeal for
their father’s honour, did not think what they were doing.”
The demand for the person of the Swazi chief, Umbilini, formed the
second point. This chief, a Swazi, was not under the jurisdiction of
Cetewayo, and though he was charged, and had been frequently
convicted of raiding, Cetewayo was in no way responsible for his
acts, otherwise than as an over-lord.
On the 11th January, 1879, the allotted period having expired, war
was declared.
“The British forces,” ran the document, “are crossing into Zululand to
exact from Cetewayo reparation for violations of British territory
committed by the sons of Sihayo and others,” and to enforce better
government of his people. “All who lay down their arms will be
provided for, ... and when the war is finished the British Government
will make the best arrangements in its power for the future good
government of the Zulus.”
Colonel Glyn commanded the 3rd Column, and Rorke’s Drift was the
point selected for the crossing of this body of troops. It consisted of
six guns of the Royal Artillery, one squadron of mounted infantry, the
24th regiment, 200 Natal volunteers, 150 mounted police, the
second battalion of the 3rd regiment, with pioneers, native
contingent, and a company of Royal Engineers.
No. 4 Column, under Colonel (afterwards Sir Evelyn) Wood, V.C., was
to advance on the Blood River. Its strength was made up of Royal
Artillery, the 13th regiment, 90th regiment, frontier light horse, and
200 of the native contingent.
In addition to the four columns, a fifth, under Colonel Rowlands,
composed of the 80th regiment and mounted irregulars, was
available. The total fighting force numbered some 7000 British and
9000 native troops—16,000 in all, with drivers. The Zulu army was
estimated at not less than 40,000 strong.
Slowly the Zulus began to work round to the rear of the British
camp, and very shortly the 24th regiment found themselves
surrounded. At this point the camp followers and native troops fled
as best they could, the Zulus killing with the assegai all they could
lay hands on. In a little while the British were entirely overwhelmed.
Says Miss Colenso:—“After this period (1.30 p.m.) no one living
escaped from Isandhlwana, and it is supposed the troops had
broken, and falling into confusion, all had perished after a brief
struggle.”
One bright incident alone stands out distinctly on this fatal 22nd
January. On the storming of the camp by the Zulus, Lieutenants
Melville and Coghill rode from the camp with the colours of their
regiment. On they spurred in their frantic flight to the Tugela, and
Coghill safely stemmed the torrent and landed on the farther shore.
Melville, however, while in mid stream, lost his horse, but clinging to
the beloved colours, battled with the furious torrent with all the
energy of despair. The Zulus pressed upon them. Quick as thought,
Coghill put his charger once more into the current, and struggled to
the assistance of his brother officer, and, despite the fact that a Zulu
bullet made short work of his horse, the two devoted men
succeeded in making their escape with the colours still in their
hands. The respite was not for long, however. Soon the yelling
hordes were upon them, and, fighting fiercely to the last,
Lieutenants Melville and Coghill died bravely upholding the honour of
their country.
Meantime the advance party had pushed forward, and came in touch
now and again with the enemy, who ever fell back before them, till
about midday, when it was determined to return to camp. About this
time word came to hand of heavy firing near the camp, and the force returned gradually till about six o’clock, when, at a distance of only two miles from the waggons, “four men were observed slowly
advancing towards the returning force. Thinking them to be enemy,
fire was opened, and one of the men fell. The others ran into the
open, holding up their hands, to show themselves unarmed.” They
proved to be the only survivors of the native contingent. “The camp
was found tenanted by those who were taking their last long sleep.”
The night of the 22nd January saw another historic incident of the
war—the heroic defence of Rorke’s Drift. At this important ford of
the Tugela, vital to the British lines of communication, were
stationed Lieutenants Chard and Bromhead, and B company, 2nd
battalion 24th regiment. One hundred and thirty-nine men in all
constituted the numbers of this devoted band. A mission station, one
building of which was used as a hospital, and one as a commissariat
store, made up Rorke’s Drift.
At 3.15 p.m. (the time has been noted with great accuracy),
Lieutenant Chard, who was down by the river, heard the sound of
furious galloping. Louder and louder grew the hoof-beats, and ere
long two spent and almost beaten horsemen drew sudden rein upon
the Zulu bank of the Tugela. Wildly they demanded to be ferried
across, and in a few frenzied words told the terrible tale of
Isandhlwana. The Zulus were coming, they cried, and not a moment
was to be lost!
From now on, the defence of Rorke’s Drift became one prolonged
and watchful struggle. Again and again the frenzied Zulus threw
themselves against the slender defences of the gallant band, and
again and again were they hurled back, now with rifle fire, now with
bayonet, but ever backward. Darkness set in, and still the rushes
continued, till at length it was found necessary to retreat into the
inner line of defence composed of the biscuit-boxes aforementioned.
At length the enemy succeeded in setting the hospital on fire, and
the awful task of removing the sick, under the fearful odds, was
taken in hand. Alas! not all could be removed, and many perished.
No effort, however, was spared to get them all out, and at the last,
with ammunition all expended, Privates Williams, Hook, R. Jones,
and W. Jones held the door with the bayonet against the Zulu horde.