1 Intro
Machine Learning
Lu Sheng (盛律), School of Software
Fall/Winter Semester 2024
TA Introduction

Responsible TAs:
- 黄泽桓 ([email protected])
- 文皓 ([email protected])
- 樊红兴 ([email protected])
- 王立芃 ([email protected])
- 陈泽人 ([email protected])

For any questions about the course, feel free to contact the TAs.

Course Description

- A newly established core course for professional degrees (48 credit hours), jointly offered by more than ten schools
- Schools offering the course this semester: Computer Science, Instrumentation & Optoelectronic Engineering, Software, Artificial Intelligence, and Cyber Science and Technology
- Teaching content and assessment may differ slightly across schools, and grading standards may also vary
- Students from the Computer Science, Instrumentation & Optoelectronic Engineering, Artificial Intelligence, and Cyber Science and Technology schools are strongly advised to contact their own school's academic affairs office and re-enroll in the Machine Learning course offered by their school
Course Communication and Resources

- Course WeChat group (QR code on the slide): after joining, please change your group nickname to "StudentID-Name"
- Course resources: Zhixue Beihang (智学北航)

Course Objectives

- Master the fundamental theory and current progress of machine learning
- Be able to apply machine learning to solve practical problems
- Build a solid foundation for related scientific research and engineering practice
Reference Books

- Pattern Recognition and Machine Learning, Christopher M. Bishop
- 统计学习方法 (Statistical Learning Methods), 李航 (Hang Li)
- Machine Learning, Tom M. Mitchell
- 机器学习 (Machine Learning), 周志华 (Zhi-Hua Zhou)

Course Outline (lecture hours in parentheses)

1. Overview of Machine Learning (3)
2. Fundamentals of Machine Learning (3)
3. Linear Models (3)
4. Regularization and Sparse Learning (2)
5. Support Vector Machines and Kernel Methods (3)
6. Neural Networks (3)
7. Deep Neural Networks (6)
8. Clustering (3)
9. Dimensionality Reduction (3)
10. Association Rule Learning (2)
11. Probabilistic Graphical Models (3)
12. Sampling Methods (2)
13. Decision Trees (2)
14. Ensemble Learning (2)
15. Semi-supervised Learning (2)
16. Reinforcement Learning (3)
17. Application Case Studies (3)

- Prerequisites: Engineering Mathematical Analysis, Advanced Algebra, Probability and Statistics, etc.
Assessment

- Coursework grade (65%)
  - 4 assignments (10% + 15% + 20% + 20%)
  - Theoretical calculation/derivation + code implementation
- Final exam grade (35%)
  - One short paper, completed individually
  - Topic: chosen by yourself; it should combine machine learning algorithms with your own research direction
  - Format: CVPR template, 6 pages of double-column main text (excluding references)
  - Defense for an "excellent" grade: a 5-minute PPT video with voice narration
  - Submit the topic and a brief introduction at midterm; submit the full paper at the end of the term
- No make-up exam!!!

What is Machine Learning?
"Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the same task (or tasks drawn from a population of similar tasks) more effectively the next time."

"Machine Learning denotes automatic changes in the AI system that are adaptive in the sense that they enable the system to do the same task (or tasks drawn from a population of similar tasks) more effectively the next time."

-- Machine Learning: An Artificial Intelligence Approach

Why Machine Learning?
- Why?
  - Optimal approach to decision making under uncertainty
  - Probabilistic modeling is the language used by most other areas of science and engineering, and thus provides a unifying framework between these fields
Machine Learning is Everywhere

ML for Daily Life: Optical Character Recognition (OCR)

- Mail digit recognition, AT&T Labs
- http://www.research.att.com/~yann/
ML for Daily Life: Speech Recognition & Translation

- https://www.microsoft.com/en-us/research/video/speech-recognition-breakthrough-for-the-spoken-translated-word/?from=https%3A%2F%2Fptop.only.wip.la%3A443%2Fhttp%2Fresearch.microsoft.com%2Fapps%2Fvideo%2F%3Fid%3D175450

ML for Social Media, Industry & E-Commerce

- Predict stock prices, improve search, reduce spam, improve advertiser return on investment, ...
- Machine learning typically generates 10+% improvements
ML for Touching New Horizons

- Identifying stars, supernovae, clusters, galaxies, quasars, exoplanets, etc.

New Breakthroughs

- AlphaGo/AlphaZero series: surpassed the best human players
- AlphaFold: structures predicted for 98.5% of proteins
- Generative foundation models
The State of ML & AI: We are Really, Really Far

- http://karpathy.github.io/2012/10/22/state-of-computer-vision/
- See also CS231n: Convolutional Neural Networks for Visual Recognition
Machine Learning Pipelines and Algorithms

General Pipeline

[Diagram: training phase: samples → learning algorithms → system; testing phase: input → system → output]
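A minimal sketch of this pipeline in Python (not from the slides: the data is synthetic, and the nearest-centroid "learning algorithm" is only an illustrative stand-in):

```python
import numpy as np

# --- Training phase: samples + learning algorithm -> learned system ---
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0, 1, (50, 2)),    # class 0 samples
                     rng.normal(3, 1, (50, 2))])   # class 1 samples
y_train = np.array([0] * 50 + [1] * 50)

# "Learning algorithm": estimate one centroid per class
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

# --- Testing phase: new input -> learned system -> prediction ---
x_new = np.array([2.5, 2.8])
prediction = np.argmin(np.linalg.norm(centroids - x_new, axis=1))
print("predicted class:", prediction)
```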
Basic Concepts of Data

- Observations/samples, assumed i.i.d. (independent and identically distributed)
- Labels (supervised setting)
- Training set and testing set

[Figure: sample digits 0-9 from the MNIST dataset, Yann LeCun (杨立昆), http://yann.lecun.com/exdb/mnist/]
Basic Concepts of Data

- Raw data: image pixels, audio waveforms, etc.
- Features, also called attributes or representations, are extracted to describe the samples conceptually
  - e.g., length, width, lightness, position of mouth, ...
  - f = [length, lightness, width, ...] (see the toy sketch below)

Taxonomy of Learning Problems

- Supervised, discrete output: Classification
- Supervised, continuous output: Regression
- Unsupervised, discrete structure: Clustering
- Unsupervised, continuous structure: Dimensionality Reduction
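A toy illustration of assembling such a feature vector (the field names and values are hypothetical; a real system would measure them from the raw data):

```python
import numpy as np

def extract_features(sample):
    # Hypothetical measurements taken from a raw sample
    # (e.g., a segmented object in an image).
    return np.array([sample["length"],     # overall size
                     sample["lightness"],  # average pixel intensity
                     sample["width"]])     # extent along the second axis

f = extract_features({"length": 14.2, "lightness": 0.63, "width": 5.1})
print(f)  # a feature vector describing the sample conceptually
```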
Supervised Learning

- The learner is provided with a set of inputs together with the corresponding desired outputs
- Has a teacher!
  - Teaching kids to recognize different animals
  - Graded examinations with corrected answers provided

Unsupervised Learning

- Training examples as input patterns, with no associated outputs
- No teacher!
  - A similarity measure exists to detect groups/clusters
Testing Phase

[Diagram: new sample (text, image, video, audio, ...) → features of sample points → feature vector → predictive model → expected label/value]
Pipeline of Unsupervised Learning

- Rewrite the general pipeline: texts, images, videos, audios, ... → feature vectors → machine learning algorithm, with no labels involved (see the sketch after this list)

More Learning Algorithms

- Supervised Learning
- Semi-supervised Learning
  - Partially labelled and unlabelled samples
  - Find decision boundaries for the complete set of training samples
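A minimal sketch of the label-free pipeline, with a tiny k-means loop standing in for the learning algorithm (synthetic feature vectors; k and the iteration count are arbitrary choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
# Feature vectors extracted from texts/images/videos/audios would go here;
# we use two synthetic blobs instead.
X = np.vstack([rng.normal(0, 0.5, (30, 2)), rng.normal(4, 0.5, (30, 2))])

k = 2
centers = X[rng.choice(len(X), size=k, replace=False)]  # random init
for _ in range(10):                                     # fixed iterations
    # Assign each sample to its nearest center (no labels involved)
    assign = np.argmin(((X[:, None, :] - centers) ** 2).sum(-1), axis=1)
    # Move each center to the mean of its assigned samples
    centers = np.stack([X[assign == c].mean(axis=0) if np.any(assign == c)
                        else centers[c]   # keep a center if its cluster empties
                        for c in range(k)])

print(centers)  # discovered group structure
```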
Training and Testing

- Training and testing data are drawn from the same distribution
- The full dataset is split into a training set and a testing set
- Evaluate learned models on the testing set (UNSEEN in the training phase)
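One common way to realize this split (the 80/20 ratio is an arbitrary choice; shuffling relies on the i.i.d. assumption above):

```python
import numpy as np

rng = np.random.default_rng(42)
X, y = rng.normal(size=(100, 5)), rng.integers(0, 2, size=100)  # toy data

perm = rng.permutation(len(X))          # shuffle once, then split
n_train = int(0.8 * len(X))
train_idx, test_idx = perm[:n_train], perm[n_train:]

X_train, y_train = X[train_idx], y[train_idx]   # used to fit the model
X_test, y_test = X[test_idx], y[test_idx]       # UNSEEN until evaluation
```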
Goals

[Figure: root-mean-square error $E_{\mathrm{RMS}}$ on the training and testing sets versus polynomial order $M$, from 0 to 9]

- A smaller RMS error on the training samples does not by itself mean a better model
- No free lunch! We need to make some assumptions or biases
- How can we fairly evaluate the model performance?
- Supervised learning: the error rate (computed in the sketch after this section),
  $\mathrm{error} = \frac{1}{N}\sum_{n=1}^{N} I[\,y_n \ne g(x_n)\,]$
- Criteria: minimum quantization error, minimum distance, MAP, MLE

Under-fitting vs. Over-fitting

- [Figure: training error $E_{\mathrm{in}}$ and testing error $E_{\mathrm{out}}$ versus model complexity: the training curve keeps decreasing while the testing curve first decreases and then increases]
- Generalization bound: $E_{\mathrm{out}}(g) \le E_{\mathrm{in}}(g) + O\!\left(\sqrt{\frac{d_{\mathrm{vc}}}{N}\ln N}\right)$
- What happens as $N$ increases?
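The error-rate formula above translates directly into code (the predictions here are placeholders for the outputs $g(x_n)$ of some trained model):

```python
import numpy as np

def error_rate(y_true, y_pred):
    # (1/N) * sum of I[y_n != g(x_n)]: the fraction of misclassified samples
    return np.mean(y_true != y_pred)

y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 0])   # e.g., outputs g(x_n) of some model
print(error_rate(y_true, y_pred))    # 0.4
```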
Bias and Variance

- Bias: how much does the average model, taken over all training sets, differ from the true model?
  - Error due to inaccurate assumptions/simplifications made by the model
- Total error = error due to incorrect assumptions (bias) + error due to the variance of training samples (variance) + unavoidable error (noise); see the simulation sketch below
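One way to see the first two error sources numerically: refit the same model class on many resampled training sets and compare the average fit with the true function. A small simulation sketch (the sine target, noise level, and degrees are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
true_f = np.sin                      # the "true model"
x_eval = np.linspace(0, np.pi, 50)   # points at which we compare fits

def fit_once(degree):
    x = rng.uniform(0, np.pi, 20)             # a fresh training set
    t = true_f(x) + rng.normal(0, 0.2, 20)    # noisy targets
    coef = np.polyfit(x, t, degree)
    return np.polyval(coef, x_eval)

for degree in (1, 9):
    fits = np.stack([fit_once(degree) for _ in range(200)])
    bias2 = np.mean((fits.mean(axis=0) - true_f(x_eval)) ** 2)  # avg model vs truth
    var = np.mean(fits.var(axis=0))                             # spread across sets
    print(f"degree={degree}: bias^2={bias2:.4f}, variance={var:.4f}")
```

Typically the low-degree fit shows higher bias and lower variance, and the high-degree fit the opposite.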
Quiz

$y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \ldots + w_M x^M = \sum_{j=0}^{M} w_j x^j$

- Given a fixed hyperparameter $M$, if the model is trained on different training sets, how will the predicted curves vary?

Under-fitting vs. Over-fitting

- Under-fitting
  - The model is too "simple" to represent all the relevant class characteristics
  - High bias and low variance
  - High training error and high testing error
- Over-fitting
  - The model is too "complex" and fits irrelevant characteristics (noise) in the data
  - Low bias and high variance
  - Low training error and high testing error

(A quick numerical check of both regimes follows this list.)
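A sketch fitting polynomials of different order $M$ to noisy sine data (the data-generating setup is assumed, mirroring the running example):

```python
import numpy as np

rng = np.random.default_rng(3)
def make_data(n):
    x = rng.uniform(0, 1, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(0, 0.2, n)

x_tr, t_tr = make_data(10)    # small training set
x_te, t_te = make_data(100)   # held-out testing set

def rms(coef, x, t):
    return np.sqrt(np.mean((np.polyval(coef, x) - t) ** 2))

for M in (0, 3, 9):
    coef = np.polyfit(x_tr, t_tr, M)
    print(f"M={M}: train RMS={rms(coef, x_tr, t_tr):.3f}, "
          f"test RMS={rms(coef, x_te, t_te):.3f}")
```

Typically $M=0$ under-fits (high error on both sets), while $M=9$ drives the training RMS to nearly zero but inflates the testing RMS.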
Regularization

- Consider the prior knowledge about the model and the task
- Punish coefficients with large values (see the sketch below):

$\tilde{E}(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N} \{y(x_n, \mathbf{w}) - t_n\}^2 + \frac{\lambda}{2}\|\mathbf{w}\|^2$
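Minimizing $\tilde{E}(\mathbf{w})$ for a model that is linear in its features has a closed-form solution; a sketch using a polynomial design matrix (the data and the value of $\lambda$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 10)

M, lam = 9, 1e-3
Phi = np.vander(x, M + 1)   # design matrix with columns [x^M, ..., x^0]

# Setting the gradient of E~(w) to zero gives the ridge normal equations:
# (Phi^T Phi + lam * I) w = Phi^T t
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ t)
print(np.abs(w).max())   # large-magnitude coefficients are suppressed
```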
Regularization

- Increase the number of training samples: the gap between training and testing error in
  $E_{\mathrm{out}}(g) \le E_{\mathrm{in}}(g) + O\!\left(\sqrt{\frac{d_{\mathrm{vc}}}{N}\ln N}\right)$
  shrinks as $N$ grows
- Regularized error, for reference:
  $\tilde{E}(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N} \{y(x_n, \mathbf{w}) - t_n\}^2 + \frac{\lambda}{2}\|\mathbf{w}\|^2$

Maximum Likelihood

Cross Validation

- If the training set is small, how do we choose reasonable hyperparameters? (see the sketch below)
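A minimal K-fold cross-validation loop for choosing $\lambda$ in the ridge model above ($K = 5$ and the candidate grid are arbitrary; `fit_ridge` repeats the closed-form solver from the previous sketch):

```python
import numpy as np

rng = np.random.default_rng(11)
x = rng.uniform(0, 1, 25)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 25)
M, K = 9, 5

def fit_ridge(x, t, lam):
    Phi = np.vander(x, M + 1)
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ t)

folds = np.array_split(rng.permutation(len(x)), K)
for lam in (0.0, 1e-6, 1e-3, 1.0):
    errs = []
    for k in range(K):
        val = folds[k]                                    # held-out fold
        trn = np.concatenate([folds[j] for j in range(K) if j != k])
        w = fit_ridge(x[trn], t[trn], lam)
        pred = np.polyval(w, x[val])
        errs.append(np.mean((pred - t[val]) ** 2))
    print(f"lambda={lam:g}: CV error={np.mean(errs):.3f}")
```

Each candidate $\lambda$ is scored on data its model never saw; the one with the lowest average validation error would be picked.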
Questions?