Hands-On
Machine Learning
with Scikit-Learn
& TensorFlow
CONCEPTS, TOOLS, AND TECHNIQUES
TO BUILD INTELLIGENT SYSTEMS
Aurélien Géron
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Hands-On Machine Learning with
Scikit-Learn and TensorFlow, the cover image, and related trade dress are trademarks of O’Reilly Media,
Inc.
While the publisher and the author have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and the author disclaim all responsibility
for errors or omissions, including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this work is at your own
risk. If any code samples or other technology this work contains or describes is subject to open source
licenses or the intellectual property rights of others, it is your responsibility to ensure that your use
thereof complies with such licenses and/or rights.
978-1-491-96229-9
[LSI]
Table of Contents
Preface
Check the Assumptions
Get the Data
Create the Workspace
Download the Data
Take a Quick Look at the Data Structure
Create a Test Set
Discover and Visualize the Data to Gain Insights
Visualizing Geographical Data
Looking for Correlations
Experimenting with Attribute Combinations
Prepare the Data for Machine Learning Algorithms
Data Cleaning
Handling Text and Categorical Attributes
Custom Transformers
Feature Scaling
Transformation Pipelines
Select and Train a Model
Training and Evaluating on the Training Set
Better Evaluation Using Cross-Validation
Fine-Tune Your Model
Grid Search
Randomized Search
Ensemble Methods
Analyze the Best Models and Their Errors
Evaluate Your System on the Test Set
Launch, Monitor, and Maintain Your System
Try It Out!
Exercises
3. Classification
MNIST
Training a Binary Classifier
Performance Measures
Measuring Accuracy Using Cross-Validation
Confusion Matrix
Precision and Recall
Precision/Recall Tradeoff
The ROC Curve
Multiclass Classification
Error Analysis
Multilabel Classification
Multioutput Classification
Exercises
6. Decision Trees
Training and Visualizing a Decision Tree
Making Predictions
Estimating Class Probabilities
The CART Training Algorithm
Computational Complexity
Gini Impurity or Entropy?
Regularization Hyperparameters
Regression
Instability
Exercises
Kernel PCA
Selecting a Kernel and Tuning Hyperparameters
LLE
Other Dimensionality Reduction Techniques
Exercises
Distributing a Deep RNN Across Multiple GPUs
Applying Dropout
The Difficulty of Training over Many Time Steps
LSTM Cell
Peephole Connections
GRU Cell
Natural Language Processing
Word Embeddings
An Encoder–Decoder Network for Machine Translation
Exercises
Exercises
Thank You!
D. Autodiff
Index
Preface
Or maybe your company has tons of data (user logs, financial data, production data,
machine sensor data, hotline stats, HR reports, etc.), and more than likely you could
unearth some hidden gems if you just knew where to look; for example:
• Segment customers and find the best marketing strategy for each group
• Recommend products for each client based on what similar clients bought
• Detect which transactions are likely to be fraudulent
• Predict next year’s revenue
• And more
Whatever the reason, you have decided to learn Machine Learning and implement it
in your projects. Great idea!
• Scikit-Learn is very easy to use, yet it implements many Machine Learning algorithms
efficiently, so it makes for a great entry point to learn Machine Learning.
• TensorFlow is a more complex library for distributed numerical computation
using data flow graphs. It makes it possible to train and run very large neural networks
efficiently by distributing the computations across potentially thousands
of multi-GPU servers. TensorFlow was created at Google and supports many of
their large-scale Machine Learning applications. It was open-sourced in November
2015.
Prerequisites
This book assumes that you have some Python programming experience and that you
are familiar with Python’s main scientific libraries, in particular NumPy, Pandas, and
Matplotlib.
Also, if you care about what’s under the hood you should have a reasonable understanding
of college-level math as well (calculus, linear algebra, probabilities, and statistics).
If you don’t know Python yet, http://learnpython.org/ is a great place to start. The official
tutorial on python.org is also quite good.
If you have never used Jupyter, Chapter 2 will guide you through installation and the
basics: it is a great tool to have in your toolbox.
If you are not familiar with Python’s scientific libraries, the provided Jupyter notebooks
include a few tutorials. There is also a quick math tutorial for linear algebra.
Roadmap
This book is organized in two parts. Part I, The Fundamentals of Machine Learning,
covers the following topics:
• What is Machine Learning? What problems does it try to solve? What are the
main categories and fundamental concepts of Machine Learning systems?
• The main steps in a typical Machine Learning project.
• Learning by fitting a model to data.
• Optimizing a cost function.
• Handling, cleaning, and preparing data.
• Selecting and engineering features.
• Selecting a model and tuning hyperparameters using cross-validation.
• The main challenges of Machine Learning, in particular underfitting and overfitting
(the bias/variance tradeoff).
• Reducing the dimensionality of the training data to fight the curse of dimensionality.
• The most common learning algorithms: Linear and Polynomial Regression,
Logistic Regression, k-Nearest Neighbors, Support Vector Machines, Decision
Trees, Random Forests, and Ensemble methods.
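Two of the Part I topics listed above, fitting a model to data and optimizing a cost function, can be illustrated with a minimal sketch. It is not taken from the book: the toy data, the function names (`mse`, `fit_linear`), the learning rate, and the iteration count are all assumptions chosen for illustration, and it uses plain Python rather than Scikit-Learn.

```python
# Minimal sketch: fit y ≈ theta0 + theta1 * x by batch gradient descent
# on the mean squared error (MSE) cost. All values below are illustrative
# assumptions, not material from the book.

def mse(theta0, theta1, xs, ys):
    """Cost function: mean squared error of the linear model on the data."""
    n = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / n


def fit_linear(xs, ys, learning_rate=0.05, n_iterations=5000):
    """Minimize the MSE cost by gradient descent; returns (theta0, theta1)."""
    theta0, theta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iterations):
        # Prediction errors of the current model on every training instance.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to each parameter.
        grad0 = 2.0 / n * sum(errors)
        grad1 = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        # Step against the gradient to reduce the cost.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Toy training set generated by y = 1 + 2x, with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

theta0, theta1 = fit_linear(xs, ys)
final_cost = mse(theta0, theta1, xs, ys)
print(theta0, theta1, final_cost)
```

On this noise-free data the parameters should approach the generating values (intercept 1, slope 2) and the cost should approach zero. With noisy data the cost would settle at a positive minimum instead, and a learning rate that is too large would make the updates diverge.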
Part II, Neural Networks and Deep Learning, covers the following topics:
The first part is based mostly on Scikit-Learn while the second part uses TensorFlow.
Don’t jump into deep waters too hastily: while Deep Learning is no
doubt one of the most exciting areas in Machine Learning, you
should master the fundamentals first. Moreover, most problems
can be solved quite well using simpler techniques such as Random
Forests and Ensemble methods (discussed in Part I). Deep Learning
is best suited for complex problems such as image recognition,
speech recognition, or natural language processing, provided you
have enough data, computing power, and patience.
Other Resources
Many resources are available to learn about Machine Learning. Andrew Ng’s ML
course on Coursera and Geoffrey Hinton’s course on neural networks and Deep
Learning are amazing, although they both require a significant time investment
(think months).
There are also many interesting websites about Machine Learning, including of
course Scikit-Learn’s exceptional User Guide. You may also enjoy Dataquest, which
provides very nice interactive tutorials, and ML blogs such as those listed on Quora.
Finally, the Deep Learning website has a good list of resources to learn more.
Of course there are also many other introductory books about Machine Learning, in
particular:
• Joel Grus, Data Science from Scratch (O’Reilly). This book presents the fundamentals
of Machine Learning, and implements some of the main algorithms in
pure Python (from scratch, as the name suggests).
• Stephen Marsland, Machine Learning: An Algorithmic Perspective (Chapman and
Hall). This book is a great introduction to Machine Learning, covering a wide
range of topics in depth, with code examples in Python (also from scratch, but
using NumPy).
• Sebastian Raschka, Python Machine Learning (Packt Publishing). Also a great
introduction to Machine Learning, this book leverages Python open source libraries
(Pylearn 2 and Theano).
• Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, Learning from
Data (AMLBook). A rather theoretical approach to ML, this book provides deep
insights, in particular on the bias/variance tradeoff (see Chapter 4).
• Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd
Edition (Pearson). This is a great (and huge) book covering an incredible number
of topics, including Machine Learning. It helps put ML into perspective.
This element signifies a general note.
O’Reilly Safari
Safari (formerly Safari Books Online) is a membership-based
training and reference platform for enterprise, government,
educators, and individuals.
Members have access to thousands of books, training videos, Learning Paths, interactive
tutorials, and curated playlists from over 250 publishers, including O’Reilly
Media, Harvard Business Review, Prentice Hall Professional, Addison-Wesley Professional,
Microsoft Press, Sams, Que, Peachpit Press, Adobe, Focal Press, Cisco Press,
John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe
Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, and
Course Technology, among others.
For more information, please visit http://oreilly.com/safari.
How to Contact Us
Please address comments and questions concerning this book to the publisher:
We have a web page for this book, where we list errata, examples, and any additional
information. You can access this page at http://bit.ly/hands-on-machine-learning-with-scikit-learn-and-tensorflow.
To comment or ask technical questions about this book, send email to bookques‐
[email protected].
For more information about our books, courses, conferences, and news, see our website
at http://www.oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
Acknowledgments
I would like to thank my Google colleagues, in particular the YouTube video classification
team, for teaching me so much about Machine Learning. I could never have
started this project without them. Special thanks to my personal ML gurus: Clément
Courbet, Julien Dubois, Mathias Kende, Daniel Kitachewsky, James Pack, Alexander
Pak, Anosh Raj, Vitor Sessak, Wiktor Tomczak, Ingrid von Glehn, Rich Washington,
and everyone at YouTube Paris.
I am incredibly grateful to all the amazing people who took time out of their busy
lives to review my book in so much detail. Thanks to Pete Warden for answering all
my TensorFlow questions, reviewing Part II, providing many interesting insights, and
of course for being part of the core TensorFlow team. You should definitely check out
his blog! Many thanks to Lukas Biewald for his very thorough review of Part II: he left
no stone unturned, tested all the code (and caught a few errors), made many great
suggestions, and his enthusiasm was contagious. You should check out his blog and
his cool robots! Thanks to Justin Francis, who also reviewed Part II very thoroughly,
catching errors and providing great insights, in particular in Chapter 16. Check out
his posts on TensorFlow!
Huge thanks as well to David Andrzejewski, who reviewed Part I and provided
incredibly useful feedback, identifying unclear sections and suggesting how to
improve them. Check out his website! Thanks to Grégoire Mesnil, who reviewed
Part II and contributed very interesting practical advice on training neural networks.
Thanks as well to Eddy Hung, Salim Sémaoune, Karim Matrah, Ingrid von Glehn,
Iain Smears, and Vincent Guilbeau for reviewing Part I and making many useful suggestions.
And I also wish to thank my father-in-law, Michel Tessier, former mathematics
teacher and now a great translator of Anton Chekhov, for helping me iron out
some of the mathematics and notations in this book and reviewing the linear algebra
Jupyter notebook.
And of course, a gigantic “thank you” to my dear brother Sylvain, who reviewed every
single chapter, tested every line of code, provided feedback on virtually every section,
and encouraged me from the first line to the last. Love you, bro!
Many thanks as well to O’Reilly’s fantastic staff, in particular Nicole Tache, who gave
me insightful feedback, always cheerful, encouraging, and helpful. Thanks as well to
Marie Beaugureau, Ben Lorica, Mike Loukides, and Laurel Ruma for believing in this
project and helping me define its scope. Thanks to Matt Hacker and all of the Atlas
team for answering all my technical questions regarding formatting, asciidoc, and
LaTeX, and thanks to Rachel Monaghan, Nick Adams, and all of the production team
for their final review and their hundreds of corrections.
Last but not least, I am infinitely grateful to my beloved wife, Emmanuelle, and to our
three wonderful kids, Alexandre, Rémi, and Gabrielle, for encouraging me to work
hard on this book, asking many questions (who said you can’t teach neural networks
to a seven-year-old?), and even bringing me cookies and coffee. What more can one
dream of?
PART I
The Fundamentals of
Machine Learning
CHAPTER 1
The Machine Learning Landscape
When most people hear “Machine Learning,” they picture a robot: a dependable butler
or a deadly Terminator, depending on who you ask. But Machine Learning is not
just a futuristic fantasy; it’s already here. In fact, it has been around for decades in
some specialized applications, such as Optical Character Recognition (OCR). But the
first ML application that really became mainstream, improving the lives of hundreds
of millions of people, took over the world back in the 1990s: it was the spam filter.
Not exactly a self-aware Skynet, but it does technically qualify as Machine Learning
(it has actually learned so well that you seldom need to flag an email as spam anymore).
It was followed by hundreds of ML applications that now quietly power hundreds
of products and features that you use regularly, from better recommendations
to voice search.
Where does Machine Learning start and where does it end? What exactly does it
mean for a machine to learn something? If I download a copy of Wikipedia, has my
computer really “learned” something? Is it suddenly smarter? In this chapter we will
start by clarifying what Machine Learning is and why you may want to use it.
Then, before we set out to explore the Machine Learning continent, we will take a
look at the map and learn about the main regions and the most notable landmarks:
supervised versus unsupervised learning, online versus batch learning, instance-
based versus model-based learning. Then we will look at the workflow of a typical ML
project, discuss the main challenges you may face, and cover how to evaluate and
fine-tune a Machine Learning system.
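To make one of these distinctions concrete, here is a small sketch contrasting instance-based and model-based learning on the same toy data. It is an illustration under stated assumptions rather than code from the book: the data, the function names (`predict_instance_based`, `fit_model_based`), and the choice of k are hypothetical, and plain Python is used so the mechanics stay visible.

```python
# Sketch: two ways to generalize from the same training examples.
# The data and all names here are illustrative assumptions.

def predict_instance_based(x_new, xs, ys, k=3):
    """Instance-based: keep the examples and average the targets of the
    k training instances closest to x_new (k-nearest neighbors regression).
    Assumes the training set holds at least k instances."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x_new))[:k]
    return sum(y for _, y in nearest) / k


def fit_model_based(xs, ys):
    """Model-based: summarize the examples as a least-squares line and
    return its parameters (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return mean_y - slope * mean_x, slope


# Toy training set lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

x_new = 2.6
knn_prediction = predict_instance_based(x_new, xs, ys, k=3)

intercept, slope = fit_model_based(xs, ys)
line_prediction = intercept + slope * x_new

print(knn_prediction, line_prediction)
```

The instance-based predictor needs the whole training set at prediction time and answers by comparing the new point with stored examples. The model-based one discards the examples once its two parameters are learned. The two predictions differ slightly here because averaging three neighbors only approximates the underlying line.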
This chapter introduces a lot of fundamental concepts (and jargon) that every data
scientist should know by heart. It will be a high-level overview (the only chapter
without much code), all rather simple, but you should make sure everything is
crystal-clear to you before continuing to the rest of the book. So grab a coffee and let’s
get started!