Introduction to Machine Learning with Applications in Information Security, 1st Edition, Mark Stamp
INTRODUCTION TO
MACHINE
LEARNING with
APPLICATIONS
in INFORMATION
SECURITY
Chapman & Hall/CRC
Machine Learning & Pattern Recognition Series
SERIES EDITORS
This series reflects the latest advances and applications in machine learning and pattern
recognition through the publication of a broad range of reference works, textbooks, and
handbooks. The inclusion of concrete examples, applications, and methods is highly encouraged.
The scope of the series includes, but is not limited to, titles in the areas of machine learning,
pattern recognition, computational intelligence, robotics, computational/statistical learning
theory, natural language processing, computer vision, game AI, game theory, neural networks,
computational neuroscience, and other relevant topics, such as machine learning applied to
bioinformatics or cognitive science, which might be proposed by potential contributors.
PUBLISHED TITLES
BAYESIAN PROGRAMMING
Pierre Bessière, Emmanuel Mazer, Juan-Manuel Ahuactzin, and Kamel Mekhnacha
UTILITY-BASED LEARNING FROM DATA
Craig Friedman and Sven Sandow
HANDBOOK OF NATURAL LANGUAGE PROCESSING, SECOND EDITION
Nitin Indurkhya and Fred J. Damerau
COST-SENSITIVE MACHINE LEARNING
Balaji Krishnapuram, Shipeng Yu, and Bharat Rao
COMPUTATIONAL TRUST MODELS AND MACHINE LEARNING
Xin Liu, Anwitaman Datta, and Ee-Peng Lim
MULTILINEAR SUBSPACE LEARNING: DIMENSIONALITY REDUCTION OF
MULTIDIMENSIONAL DATA
Haiping Lu, Konstantinos N. Plataniotis, and Anastasios N. Venetsanopoulos
MACHINE LEARNING: An Algorithmic Perspective, Second Edition
Stephen Marsland
SPARSE MODELING: THEORY, ALGORITHMS, AND APPLICATIONS
Irina Rish and Genady Ya. Grabarnik
A FIRST COURSE IN MACHINE LEARNING, SECOND EDITION
Simon Rogers and Mark Girolami
INTRODUCTION TO MACHINE LEARNING WITH APPLICATIONS IN
INFORMATION SECURITY
Mark Stamp
Chapman & Hall/CRC
Machine Learning & Pattern Recognition Series
INTRODUCTION TO
MACHINE
LEARNING with
APPLICATIONS
in INFORMATION
SECURITY
Mark Stamp
San Jose State University
California
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize
to copyright holders if permission to publish in this form has not been obtained. If any copyright material
has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information storage
or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc.
(CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization
that provides licenses and registration for a variety of users. For organizations that have been granted a
photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Preface
Acknowledgments
1 Introduction
   1.1 What Is Machine Learning?
   1.2 About This Book
   1.3 Necessary Background
   1.4 A Few Too Many Notes
II Applications
Index
Preface
For the past several years, I’ve been teaching a class on “Topics in Information
Security.” Each time I taught this course, I’d sneak in a few more machine
learning topics. For the past couple of years, the class has been turned on
its head, with machine learning being the focus, and information security
only making its appearance in the applications. Unable to find a suitable
textbook, I wrote a manuscript, which slowly evolved into this book.
In my machine learning class, we spend about two weeks on each of the
major topics in this book (HMM, PHMM, PCA, SVM, and clustering). For
each of these topics, about one week is devoted to the technical details in
Part I, and another lecture or two is spent on the corresponding applications
in Part II. The material in Part I is not easy; by including relevant
applications, the material is reinforced, and the pace is more reasonable.
I also spend a week covering the data analysis topics in Chapter 8, and I
cover several of the mini topics in Chapter 7, based on time constraints
and student interest.1
Machine learning is an ideal subject for substantive projects. In topics
classes, I always require projects, which are usually completed by pairs of
students, although individual projects are allowed. At least one week is
allocated to student presentations of their project results.
A suggested syllabus is given in Table 1. This syllabus should leave time
for tests, project presentations, and selected special topics. Note that the
applications material in Part II is intermixed with the material in Part I.
Also note that the data analysis chapter is covered early, since it’s relevant
to all of the applications in Part II.
1
Who am I kidding? Topics are selected based on my interests, not student interest.
Mark Stamp
Los Gatos, California
April, 2017
2
In my experience, in-person lectures are infinitely more valuable than any recorded or
online format. Something happens in live classes that will never be fully duplicated in any
dead (or even semi-dead) format.
About the Author
My work experience includes more than seven years at the National Security
Agency (NSA), which was followed by two years at a small Silicon Valley
startup company. Since 2002, I have been a card-carrying member of the
Computer Science faculty at San Jose State University (SJSU).
My love affair with machine learning began during the early 1990s, when
I was working at the NSA. In my current job at SJSU, I’ve supervised vast
numbers of master’s student projects, most of which involve some combination
of information security and machine learning. In recent years, students have
become even more eager to work on machine learning projects, which I would
like to ascribe to the quality of the book that you have before you and my
magnetic personality, but instead, it’s almost certainly a reflection of trends
in the job market.
I do have a life outside of work.3 Recently, kayak fishing and sailing my
Hobie kayak in the Monterey Bay have occupied most of my free time. I also
ride my mountain bike through the local hills and forests whenever possible.
In case you are a masochist, a more complete autobiography can be found at
http://www.sjsu.edu/people/mark.stamp/
If you have any comments or questions about this book (or anything else)
you can contact me via email at [email protected]. And if you happen
to be local, don’t hesitate to stop by my office to chat.
3
Of course, here I am assuming that what I do for a living could reasonably be classified
as work. My wife (among others) has been known to dispute that assumption.
Acknowledgments
The first draft of this book was written while I was on sabbatical during the
spring 2014 semester. I first taught most of this material in the fall semester
of 2014, then again in fall 2015, and yet again in fall 2016. After the third
iteration, I was finally satisfied that the manuscript had the potential to be
book-worthy.
All of the students in these three classes deserve credit for helping to
improve the book to the point where it can now be displayed in public without
excessive fear of ridicule. Here, I’d like to single out the following students
for their contributions to the applications in Part II.
Topic       Students
HMM         Sujan Venkatachalam, Rohit Vobbilisetty
PHMM        Lin Huang, Swapna Vemparala
PCA         Ranjith Jidigam, Sayali Deshpande, Annapurna Annadatha
SVM         Tanuvir Singh, Annapurna Annadatha
Clustering  Chinmayee Annachhatre, Swathi Pai, Usha Narra
Chapter 1
Introduction
I took a speed reading course and read War and Peace in twenty minutes.
It involves Russia.
— Woody Allen
the primary goal of this book is to provide the reader with a deeper
understanding of what is actually happening inside those mysterious machine
learning black boxes.
Why should anyone care about the inner workings of machine learning
algorithms when a simple black box approach can, and often does, suffice? If
you are like your curious author, you hate black boxes, and you want to know
how and why things work as they do. But there are also practical reasons
for exploring the inner sanctum of machine learning. As with any technical
field, the cookbook approach to machine learning is inherently limited. When
applying machine learning to new and novel problems, it is often essential to
have an understanding of what is actually happening “under the covers.” In
addition to being the most interesting cases, such applications are also likely
to be the most lucrative.
By way of analogy, consider a medical doctor (MD) in comparison to a
nurse practitioner (NP).1 It is often claimed that an NP can do about 80%
to 90% of the work that an MD typically does. And the NP requires less
training, so when possible, it is cheaper to have NPs treat people. But, for
challenging or unusual or non-standard cases, the higher level of training of
an MD may be essential. So, the MD deals with the most challenging and
interesting cases, and earns significantly more for doing so. The aim of this
book is to enable the reader to earn the equivalent of an MD in machine
learning.
The bottom line is that the reader who masters the material in this book
will be well positioned to apply machine learning techniques to challenging
and cutting-edge applications. Most such applications would likely be beyond
the reach of anyone with a mere black box level of understanding.
sometimes skip a few details, and on occasion, we might even be a little bit
sloppy with respect to mathematical niceties. The goal here is to present
topics at a fairly intuitive level, with (hopefully) just enough detail to clarify
the underlying concepts, but not so much detail as to become overwhelming
and bog down the presentation.3
In this book, the following machine learning topics are covered in
chapter-length detail.
Topic                                  Where
Hidden Markov Models (HMM)             Chapter 2
Profile Hidden Markov Models (PHMM)    Chapter 3
Principal Component Analysis (PCA)     Chapter 4
Support Vector Machines (SVM)          Chapter 5
Clustering (K-Means and EM)            Chapter 6
Topic                                  Where
k-Nearest Neighbors (k-NN)             Section 7.2
Neural Networks                        Section 7.3
Boosting and AdaBoost                  Section 7.4
Random Forest                          Section 7.5
Linear Discriminant Analysis (LDA)     Section 7.6
Vector Quantization (VQ)               Section 7.7
Naïve Bayes                            Section 7.8
Regression Analysis                    Section 7.9
Conditional Random Fields (CRF)        Section 7.10
http://www.cs.sjsu.edu/~stamp/ML/
where you’ll find links to PowerPoint slides, lecture videos, and other relevant
material. An updated errata list is also available. And for the reader’s benefit,
all of the figures in this book are available in electronic form, and in color.
3
Admittedly, this is a delicate balance, and your unbalanced author is sure that he didn’t
always achieve an ideal compromise. But you can rest assured that it was not for lack of
trying.
Chapter 2
A Revealing Introduction to
Hidden Markov Models
The bottom line is that this chapter is the linchpin for much of the remainder
of the book. Consequently, if you learn the material in this chapter well,
it will pay large dividends in most subsequent chapters. On the other hand,
if you fail to fully grasp the details of HMMs, then much of the remaining
material will almost certainly be more difficult than is necessary.
HMMs are based on discrete probability. In particular, we’ll need some
basic facts about conditional probability, so in the remainder of this section,
we provide a quick overview of this crucial topic.
The notation “|” denotes “given” information, so that $P(B \mid A)$ is read as
“the probability of $B$, given $A$.” For any two events $A$ and $B$, we have
$$P(A \text{ and } B) = P(A)\,P(B \mid A). \tag{2.1}$$
For example, suppose that we draw two cards without replacement from a
standard 52-card deck. Let $A = \{\text{1st card is ace}\}$ and $B = \{\text{2nd card is ace}\}$.
Then
$$P(A \text{ and } B) = P(A)\,P(B \mid A) = 4/52 \cdot 3/51 = 1/221.$$
In this example, $P(B)$ depends on what happens in the first event $A$, so we
say that $A$ and $B$ are dependent events. On the other hand, suppose we flip
a fair coin twice. Then the probability that the second flip comes up heads
is 1/2, regardless of the outcome of the first coin flip, so these events are
independent. For dependent events, the “given” information is relevant when
determining the sample space. Consequently, in such cases we can view the
information to the right of the “given” sign as defining the space over which
probabilities will be computed.
We can rewrite equation (2.1) as
$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}.$$
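The card example can be checked by brute-force counting. The following is a minimal sketch, not taken from the book: it enumerates every ordered draw of two cards and counts outcomes, so the "given" information really does shrink the sample space. The variable names are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import permutations

# Model the deck as 52 cards, 4 of which are aces (True marks an ace).
deck = [True] * 4 + [False] * 48

# Enumerate every ordered draw of two distinct cards (no replacement).
draws = list(permutations(range(52), 2))

# Restrict the sample space to draws where the first card is an ace,
# then to draws where the second card is an ace as well.
first_ace = [d for d in draws if deck[d[0]]]
both_aces = [d for d in first_ace if deck[d[1]]]

p_a = Fraction(len(first_ace), len(draws))              # P(A)
p_b_given_a = Fraction(len(both_aces), len(first_ace))  # P(B | A)
p_a_and_b = Fraction(len(both_aces), len(draws))        # P(A and B)

print(p_a, p_b_given_a, p_a_and_b)
```

Note that the conditional probability is computed over the restricted list `first_ace`, which mirrors the idea that the information to the right of the "given" sign defines the space over which probabilities are computed.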
which comes from (2.3). For this example, suppose that the initial state
distribution, denoted by $\pi$, is
$$\pi = \begin{pmatrix} 0.6 & 0.4 \end{pmatrix}, \tag{2.6}$$
that is, the chance that we start in the $H$ state is 0.6 and the chance that
we start in the $C$ state is 0.4. The matrices $\pi$, $A$, and $B$ are row stochastic,
which is just a fancy way of saying that each row satisfies the requirements
of a discrete probability distribution (i.e., each element is between 0 and 1,
and the elements of each row sum to 1).
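The row stochastic condition is easy to test mechanically. The sketch below is an illustration rather than anything from the book; the helper name `is_row_stochastic` and the tolerance are assumptions made here, and the initial distribution from (2.6) is treated as a matrix with a single row.

```python
import math

def is_row_stochastic(matrix, tol=1e-9):
    """Return True if every row is a discrete probability distribution."""
    for row in matrix:
        # Each element must lie between 0 and 1.
        if any(x < 0.0 or x > 1.0 for x in row):
            return False
        # The elements of each row must sum to 1 (up to rounding error).
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            return False
    return True

pi = [[0.6, 0.4]]  # initial state distribution (2.6), as a 1 x 2 matrix
print(is_row_stochastic(pi))
print(is_row_stochastic([[0.6, 0.5]]))  # hypothetical row that sums to 1.1
```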
Now, suppose that we consider a particular four-year period of interest
from the distant past. For this particular four-year period, we observe the
series of tree ring sizes $S, M, S, L$. Letting 0 represent $S$, 1 represent $M$, and 2
represent $L$, this observation sequence is denoted as
$$\mathcal{O} = (0, 1, 0, 2). \tag{2.7}$$
We might want to determine the most likely state sequence of the Markov
process given the observations (2.7). That is, we might want to know the most
likely average annual temperatures over this four-year period of interest. This
is not quite as clear-cut as it seems, since there are different possible
interpretations of “most likely.” On the one hand, we could define “most likely”
as the state sequence with the highest probability from among all possible
state sequences of length four. Dynamic programming (DP) can be used to
efficiently solve this problem. On the other hand, we might reasonably define
“most likely” as the state sequence that maximizes the expected number of
correct states. An HMM can be used to find the most likely hidden state
sequence in this latter sense.
It’s important to realize that the DP and HMM solutions to this problem
are not necessarily the same. For example, the DP solution must, by definition,
include valid state transitions, while this is not the case for the HMM.
And even if all state transitions are valid, the HMM solution can still differ
from the DP solution, as we’ll illustrate in an example below.
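The two senses of "most likely" can be compared directly by brute force. The sketch below is illustrative only. The transition matrix `A` and the observation matrix `B` do not appear in this excerpt, so the values used here are assumptions chosen for the example; only `pi` and the observation sequence come from (2.6) and (2.7). Exhaustive enumeration of all 16 state sequences stands in for the efficient algorithms, which are not shown here.

```python
from itertools import product

# States: 0 = H, 1 = C. Observations: 0, 1, 2 as in (2.7).
pi = [0.6, 0.4]                          # initial distribution from (2.6)
A = [[0.7, 0.3], [0.4, 0.6]]             # ASSUMED transition matrix (illustrative)
B = [[0.1, 0.4, 0.5], [0.7, 0.2, 0.1]]   # ASSUMED observation matrix (illustrative)
O = [0, 1, 0, 2]

def joint_probability(states):
    """Probability of this state sequence occurring together with O."""
    p = pi[states[0]] * B[states[0]][O[0]]
    for t in range(1, len(O)):
        p *= A[states[t - 1]][states[t]] * B[states[t]][O[t]]
    return p

# Score every possible state sequence of length four.
paths = list(product(range(2), repeat=len(O)))
probs = {path: joint_probability(path) for path in paths}

# "Most likely" in the DP sense: the single highest-scoring path.
dp_path = max(probs, key=probs.get)

# "Most likely" in the HMM sense: at each position, pick the state with the
# largest total probability summed over all paths passing through it.
hmm_path = tuple(
    max(range(2), key=lambda s: sum(p for path, p in probs.items() if path[t] == s))
    for t in range(len(O))
)

names = "HC"
print("DP :", "".join(names[s] for s in dp_path))
print("HMM:", "".join(names[s] for s in hmm_path))
```

With these assumed matrices, the two answers differ (CCCH in the DP sense versus CHCH in the HMM sense), even though every state transition is valid.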
Before going into more detail, we need to deal with the most challenging
aspect of HMMs—the notation. Once we have the notation, we’ll discuss the
Addressing the Answer.
Wedding Invitations.
Wedding Anniversaries.
Anniversary invitations require an answer, thus giving a very
pleasant opportunity for congratulating the happy couple. The
following forms are suitable:
Mr. and Mrs. Arthur Cummings accept with pleasure the kind
invitation of Mr. and Mrs. Kennet Wade for Thursday evening,
October tenth, and present their warmest congratulations on
their Silver Wedding Anniversary. 45 Church Street. Thursday.
For a refusal:
Invitations for these are written in the same form as for a dinner,
merely substituting the word “luncheon” or “supper” for “dinner,”
and should be accepted or refused in precisely the same style.
Answers also should be sent with the same promptness that the
hostess may be certain of arranging her table satisfactorily.
Other Invitations.
“COURTSHIP,” according to Sterne, “consists in a number of quiet
attentions, not so pointed as to alarm, nor so vague as not to be
understood.”
In this little quotation lies the spirit and the letter of all etiquette
regarding courtship. The passion of love generally appears to
everyone, save the man who feels it, so entirely disproportionate to
the value of the object, so impossible to be entered into by any
outside individual, that any strong expressions of it appear ridiculous
to a third person. For this reason it is that all extravagance of feeling
should be carefully repressed as an offense against good breeding.
Man was made for woman, and woman equally for man. How shall
they treat each other? How shall they come to understand their
mutual relations and duties? It is lofty work to write upon this
subject what ought to be written. Mistakes, fatal blunders, hearts
and lives wrecked, homes turned into bear-gardens, tears, miseries,
blasted hopes, awful tragedies—can you name the one most prolific
cause of all these?
If our young people were taught what they ought to know—if it
were told them from infancy up—if it were drilled into them and they
were made to understand what now is all a mystery to them—a
dark, vague, unriddled mystery—hearts would be happier, homes
would be brighter, lives would be worth living and the world would
be better.
“Good Night! Good Night! Parting is such sweet sorrow, That I shall say
good night till it be morrow.”
A POLITE ESCORT.
This is now the matter—matter grave and serious enough—which
we have in hand. There are gems of wisdom founded on health,
morality, happiness, which should be put within reach of every
household in our whole broad land. It is a most important, yet
neglected subject. People are squeamish, cursed with mock
modesty, ashamed to speak with their lips what their Creator spoke
through their own minds and bodies when he formed them. It is
time such nonsense—nonsense shall we say?—rather say it is time
such fatal folly were withered and cursed by the sober common
sense and moral duty of universal society.
Courtship! Its theme, how delightful! Its memories and
associations, how charming! Its luxuries the most luxurious proffered
to mortals! Its results how far reaching, and momentous! No mere
lover’s fleeting bauble, but life’s very greatest work! None are
equally portentous, for good and evil.
Errors of Love-Making.
Choice of Associates.
First Steps.
Character.
Disposition.
Trifling.
Proposals of Marriage.
The proposal itself is a subject so closely personal in its nature
that each man must be a law unto himself in the matter, and time
and opportunity will be his only guides to success, unless, mayhap,
his lady-love be the braver of the two and help him gently over the
hardest part, for there be men and men; some who brook not “no”
for an answer, and some that a moment’s hesitation on the part of
the one sought would seal their lips forever.
A woman must always remember that a proposal of marriage is
the highest honor that a man can pay her, and, if she must refuse it,
to do so in such fashion as to spare his feelings as much as possible.
If she be a true and well-bred woman, both proposal and refusal will
be kept a profound secret from every one save her parents. It is the
least balm she can offer to the wounded pride of the man who has
chosen her from out all women to bear his name and to reign in his
home. A wise woman can almost always prevent matters from
coming to the point of a declaration, and, by her actions and her
prompt acceptance of the attentions of others, should strive to show
the true state of her feelings.
A gentleman should usually take “no” for an answer unless he be
of so persevering a disposition as to be determined to take the fort
by siege; or unless the “no” was so undecided in its tone as to give
some hope of finding true the poet’s words:
“He gave them but one tongue to say us, ‘Nay,’
And two fond eyes to grant.”
On the gentleman’s part, a decided refusal should be received as
calmly as possible, and his resolve should be in no way to annoy the
cause of all his pain. If mere indifference be or seem to be the origin
of the refusal, he may, after a suitable length of time, press his suit
once more; but if an avowed or evident preference for another be
the reason, it becomes imperative that he should at once withdraw
from the field. Any reason that the lady may, in her compassion, see
fit to give him as cause for her refusal, should ever remain his
inviolable secret.
SOCIAL PASTIME ON RETURN VOYAGE
DECLINED WITH REGRETS.
As whatever grows has its natural period for maturing, so has
love. At engagement you have merely selected, so that your
familiarity should be only intellectual, not affectional. You are yet
more acquaintances than companions. As sun changes from
midnight darkness into noonday brilliancy, and heats, lights up, and
warms gradually, and as summer “lingers in the lap of spring;” so
marriage should dally in the lap of courtship. Nature’s adolescence of
love should never be crowded into a premature marriage. The more
personal, the more impatient it is; yet to establish its Platonic aspect
takes more time than is usually given it; so that undue haste puts it
upon the carnal plane, which soon cloys, then disgusts.
Unbecoming Haste.
“Dear Sir: Your proffer of your hand and heart in marriage has
been duly received, and its important contents fully considered.
“I accept your offer: and on its only condition, that I
reciprocate your love, which I do completely; and hereby both
offer my own hand and heart in return, and consecrate my
entire being, soul and body, all I am and can become, to you
alone; both according you the ‘privilege’ you crave of loving me,
and ‘craving’ a like one in return.
“Thank Heaven that this matter is settled; that you are in very
deed mine, while I am yours, to love and be loved by, live and
be lived with and for; and that my gushing affections have a
final resting-place on one every way so worthy of the fullest
reciprocal sympathy and trust.
“The preliminaries of our marriage we will arrange whenever
we meet, which I hope may be soon. But whether sooner or
later, or you are present or absent, I now consider myself as
wholly yours, and you all mine; and both give and take the
fullest privilege of cherishing and expressing for you that whole-
souled love I find even now gushing up and calling for
expression. Fondly hoping to hear from and see you soon and
often, I remain wholly yours forever,
C. D.”
The vow and its tangible witnesses come next. All agreements
require to be attested; and this as much more than others as it is
the most obligatory. Both need its unequivocal and mutual
mementos, to be cherished for all time to come as its perpetual
witnesses. This vow of each to the other can neither be made too
strong, nor held too sacred. If calling God to witness will strengthen
your mutual adjuration, swear by Him and His throne, or by
whatever else will render it inviolable, and commit it to writing, each
transcribing a copy for the other as your most sacred relics, to be
enshrined in your “holy of holies.”
Two witnesses are required, one for each. A ring for her and locket
for him, containing the likeness of both, as always showing how they
now look, or any keepsake both may select, more or less valuable,
to be handed down to their posterity, will answer.
Your mode of conducting your future affairs should now be
arranged. Though implied in selection, yet it must be specified in
detail. Both should arrange your marriage relations; say what each
desires to do, and have done; and draw out a definite outline plan of
the various positions you desire to maintain towards each other. Your
future home must be discussed: whether you will board, or live in
your own house, rented, or owned, or built, and after what pattern;
or with either or which of your parents. And it is vastly important
that wives determine most as to their domiciles; their internal
arrangements, rooms, furniture, management; respecting which
they are consulted quite too little, yet cannot well be too much.
Family rules, as well as national, state, corporate, financial, must
be established. They are most needed, yet least practiced in
marriage. Without them, all must be chaotic. Ignoring them is a
great but common marital error. The Friends wisely make family
method cardinal.
A Full Understanding.
Important Trifles.