Unit 2:
The Fundamentals of Machine Learning
Outline
• Machine Learning
  • Introduction
  • Types of machine learning
  • Challenges of Machine Learning
  • The Machine Learning Framework

Reference: Géron, Chapter 1
Problems with Traditional Programming
Traditional programming paradigm:
[Diagram: study the problem → write rules → evaluate → deploy, with an "analyze errors" loop back to studying the problem]
The Machine Learning Framework
▪ Instead of handcrafting rules, ML learns a model from training samples (data)
▪ One learning algorithm can be applied to different problems (see the sketch below)
[Diagram: collect data → training samples → train → evaluate → deploy, with an "analyze errors" loop]
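As a rough sketch of this workflow (assuming scikit-learn is available; the dataset is synthetic and the two algorithms are arbitrary illustrative choices), the same train/evaluate loop works regardless of which learning algorithm is plugged in:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# "Collect data": synthetic samples stand in for a real dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# The same train/evaluate workflow works with different learning algorithms
for model in (LogisticRegression(max_iter=1000), DecisionTreeClassifier()):
    model.fit(X_train, y_train)           # train on the samples
    acc = model.score(X_test, y_test)     # evaluate before deploying
    print(type(model).__name__, acc)
```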
What is Machine Learning?
Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed. (Arthur Samuel, 1959)
Why use Machine Learning?
▪ Some problems are too complex to solve with handcrafted rules
• An example: image classification
Why use Machine Learning?
[Diagram: study the problem → train ML algorithm → evaluate solution → deploy, with an "analyze errors" loop; the training samples and data can be updated over time]
Why use Machine Learning?
[Figure: machine learning benefits from lots of training samples]
Data in Machine Learning
▪ Numerical (quantitative) data
• Discrete (e.g. 1, 3, 8, -4, …)
• Continuous (e.g. 2.321, 0.2437, …)
▪ Categorical (qualitative) data
• Ordinal (e.g. low, medium, high)
• Nominal (e.g. red, blue, yellow)
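A small sketch of these four data types using pandas (an assumed tooling choice; the column names and values are made up):

```python
import pandas as pd

df = pd.DataFrame({
    "rooms":  [1, 3, 8],                # numerical, discrete
    "weight": [2.321, 0.2437, 1.5],     # numerical, continuous
    "level":  pd.Categorical(["low", "medium", "high"],
                             categories=["low", "medium", "high"],
                             ordered=True),                # categorical, ordinal
    "colour": pd.Categorical(["red", "blue", "yellow"]),   # categorical, nominal
})
print(df.dtypes)
```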
Data in Machine Learning
▪ Labelled data vs unlabelled data
• Labelled data: data that comes with a tag (e.g. a class name or target value)
• Unlabelled data: data that comes with no tag
Structured Data vs Unstructured Data
Structured Data
• Specific and stored in a predefined format
• Suitable for traditional machine learning
• The focus of this course

Unstructured Data
• A collection of varied types of data stored in their native formats (e.g. text, image, video, audio, …)
• Better results with deep learning techniques
Types of Machine Learning Algorithms
Supervised Learning
In supervised learning, the training data fed to the algorithm includes the desired solutions, called labels.
Supervised Learning Tasks
Classification
• Predicts discrete-valued output (e.g. present / not present)
• Example: object detection (images with car)

Regression
• Predicts continuous-valued output (e.g. house price)
• Example: housing price prediction
[Figure: price (RM x1000) vs. size for the housing data]
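A hedged sketch of the two task types with scikit-learn (the data below is made up, not the housing data from the figure):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

# Classification: discrete output (0/1)
X_cls = np.array([[1.0], [2.0], [3.0], [4.0]])
y_cls = np.array([0, 0, 1, 1])
clf = LogisticRegression().fit(X_cls, y_cls)
print(clf.predict([[2.5]]))          # -> a discrete class label

# Regression: continuous output (e.g. price vs size)
X_reg = np.array([[500], [1000], [1500], [2000]])   # size
y_reg = np.array([100.0, 180.0, 260.0, 330.0])      # price (RM x1000)
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict([[1250]]))         # -> a continuous value
```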
Supervised Learning – Classification Applications
Income classification:
• Input: numerical and categorical data
• Output: 1 (income ≤ 50K), 0 (income > 50K) – discrete
• Features: age, workclass, marital-status, race, education, area, …
More Classification Applications
Supervised Learning – Regression Applications
Predict the house price in the Boston area (regression):
• Input: numerical and categorical data
• Output: house price (in USD ×1000) – continuous
• Features:
− CRIM: per capita crime rate by town
− ZN: proportion of residential land
− INDUS: proportion of non-retail business acres per town
− CHAS: Charles River dummy variable (= 1 if near river; 0 otherwise)
− NOX: nitric oxides concentration (parts per 10 million)
− RM: average number of rooms per dwelling
− AGE: proportion of units built prior to 1940
− DIS: weighted distance to employment centres
− RAD: index of accessibility to radial highways
− TAX: full-value property-tax rate per $10,000
− …
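A sketch of such a regression workflow. Note that the Boston housing dataset has been removed from recent scikit-learn releases, so this example substitutes the similar California housing dataset:

```python
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Load a housing dataset (downloads on first call)
data = fetch_california_housing()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=42)

# Fit a linear model on the numerical features, then evaluate
model = LinearRegression().fit(X_train, y_train)
print("test MSE:", mean_squared_error(y_test, model.predict(X_test)))
```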
A Simple Supervised Learning Example (1/6)
Task: predict salary from years of experience.
A Simple Supervised Learning Example (2/6)
[Figure: salary vs. years of experience, with the data split into a training set and a test set]
A Simple Supervised Learning Example (4/6)
A linear model y = θ₀ + θ₁x is fitted to the training set; the prediction error is the gap between the predicted and the actual salary.
[Figure: fitted line over the training data, with prediction errors marked]
A Simple Supervised Learning Example (5/6)
[Figure: salary vs. years of experience]
A Simple Supervised Learning Example (6/6)
[Figure: the fitted model y = θ₀ + θ₁x plotted over the salary vs. years of experience data]
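A minimal sketch of the whole example with NumPy (the salary figures are invented for illustration, not taken from the slides):

```python
import numpy as np

x = np.array([1, 2, 3, 5, 7, 10], dtype=float)        # years of experience
y = np.array([30, 35, 42, 55, 66, 90], dtype=float)   # salary (x1000)

# Least-squares fit of a degree-1 polynomial returns [θ1, θ0]
theta1, theta0 = np.polyfit(x, y, 1)
print(f"y = {theta0:.1f} + {theta1:.1f}·x")
print("predicted salary at 4 years:", theta0 + theta1 * 4)
```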
Supervised Learning Algorithms
▪ k-Nearest Neighbors
▪ Linear Regression
▪ Logistic Regression
▪ Support Vector Machines (SVMs)
▪ Decision Trees and Random Forests
▪ Neural Networks
Unsupervised Learning Task: Clustering
▪ Group similar samples into clusters without using labels
Unsupervised Learning Task: Anomaly detection
▪ Identify items, events or observations that do not conform to an expected pattern or to other items in a dataset
▪ Anomalies are also referred to as outliers, novelties or noise
[Figure: scattered data points with the outliers highlighted]
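One way to sketch this (the slide names no algorithm; IsolationForest is an assumed choice, and the data is synthetic):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
normal = rng.normal(loc=0.0, scale=1.0, size=(100, 2))   # expected pattern
outliers = np.array([[8.0, 8.0], [-9.0, 7.0]])           # do not conform
X = np.vstack([normal, outliers])

detector = IsolationForest(random_state=42).fit(X)
labels = detector.predict(X)          # +1 = normal, -1 = anomaly
print("anomalies found:", (labels == -1).sum())
```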
Unsupervised Learning Tasks and Algorithms
▪ Clustering
• k-Means
• Hierarchical Cluster Analysis (HCA)
• Expectation Maximization
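A minimal k-Means sketch with scikit-learn on synthetic unlabelled points (the blob positions are made up):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic blobs of unlabelled points
X = np.vstack([rng.normal(0, 0.5, (50, 2)),
               rng.normal(5, 0.5, (50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.cluster_centers_)   # discovered group centres; no labels used
```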
Semi-Supervised Learning
▪ Partially labelled training data
[Table: samples with features (length, width, weight); only some rows have a label]
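A hedged sketch using scikit-learn's LabelPropagation (one possible semi-supervised algorithm; the slide does not name one, and the feature values below are invented):

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Features: length, width, weight (made-up values for illustration)
X = np.array([[10.0, 2.0, 0.5],
              [11.0, 2.2, 0.6],
              [20.0, 5.0, 3.0],
              [21.0, 5.1, 3.2],
              [10.5, 2.1, 0.55],   # unlabelled
              [20.5, 5.0, 3.1]])   # unlabelled
y = np.array([0, 0, 1, 1, -1, -1])  # -1 marks a missing label

model = LabelPropagation().fit(X, y)
print(model.transduction_)   # labels inferred for all samples
```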
Reinforcement Learning
The agent:
1. Observes the environment
2. Selects an action using its policy
3. Performs the action
4. Receives a reward or penalty
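A toy sketch of this loop: a two-armed bandit with an epsilon-greedy policy (a simplified RL setting where observing the environment, step 1, is trivial; all numbers are illustrative):

```python
import random

rewards = {0: 0.3, 1: 0.7}      # hidden win probability of each action
values = {0: 0.0, 1: 0.0}       # agent's running value estimates
counts = {0: 0, 1: 0}

for step in range(1000):
    # 2. select an action using the policy (epsilon-greedy)
    if random.random() < 0.1:
        action = random.choice([0, 1])
    else:
        action = max(values, key=values.get)
    # 3. perform the action; 4. receive a reward or penalty
    reward = 1.0 if random.random() < rewards[action] else 0.0
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]

print(values)   # estimates converge toward the true probabilities
```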
Differences between Machine Learning Types
Supervised learning: solves problems by mapping input to known output; used for regression and classification tasks.
Unsupervised learning: solves problems by discovering underlying patterns; used for clustering and association tasks.
Reinforcement learning: solves problems by trial and error; used for control and decision-making tasks.
Foundation Model & Self-Supervised Learning
▪ Conventionally, an AI model is trained on task-specific data to perform one very specific task.
▪ A new paradigm in AI has emerged called foundation models. Unlike traditional AI models, foundation models learn from massive datasets across different domains.
▪ Through self-supervised learning techniques, a foundation model teaches itself a broad scope of knowledge and an understanding of the world (general intelligence).
▪ Large language models (LLMs) such as OpenAI's GPT-4 and Google's PaLM are examples of foundation models.
▪ Foundation models can then be adapted to perform other tasks through fine-tuning or prompting.
• GPT -> ChatGPT, GPT -> Copilot, GPT -> Duolingo
▪ Foundation models require far more data and computing power to train.
Self-Supervised Learning
▪ Self-supervised learning is a machine learning process in which the model trains itself by predicting one part of the input from another part, in order to obtain useful representations and knowledge.
▪ The trained model can then help with downstream learning tasks.
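A toy illustration of the idea (not a real self-supervised model): the training targets are derived from the raw text itself, with no human labelling, by predicting the next word from the current one:

```python
from collections import Counter, defaultdict

text = "the cat sat on the mat the cat ate the fish".split()

# Derive (input, target) pairs from the data itself: predict word i+1 from word i
next_word = defaultdict(Counter)
for current, nxt in zip(text, text[1:]):
    next_word[current][nxt] += 1

# The learned statistics can serve a downstream task: prediction
print(next_word["the"].most_common(1))   # most likely word after "the"
```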
How LLMs are trained
[Figure: self-supervised training pipeline for LLMs. Source: Borealis AI]
Challenges of Machine Learning
Poor-quality data
▪ Training data may contain errors, for example outliers
[Figure: data points with an outlier highlighted]
▪ Data cleaning: most data scientists spend significant time cleaning the data. For example:
• Fill in missing values
• Drop a column (feature) with many missing values or errors
• Remove rows (samples) with outliers
• Fix errors and formatting manually
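These steps sketched with pandas (the DataFrame and thresholds are invented for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age":    [25, 32, np.nan, 41, 29],
    "salary": [3.2, 3.8, 4.1, 120.0, np.nan],           # 120.0 is an outlier
    "notes":  [np.nan, np.nan, np.nan, "ok", np.nan],   # mostly missing
})

df["age"] = df["age"].fillna(df["age"].median())        # fill in missing values
df["salary"] = df["salary"].fillna(df["salary"].median())
df = df.drop(columns=["notes"])        # drop a column with many missing values
df = df[df["salary"] < 100]            # remove rows with outliers
print(df)
```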
Non-representative data
▪ Training data should be representative of the new cases that you want to generalize to.
▪ Consider fitting a linear model to the GDP dataset with and without 7 missing countries: the model fitted without the missing countries does not generalize well.
[Figure: linear fits to the GDP data with and without the missing countries]
Irrelevant features
▪ Selected features must be relevant to the task at hand. Irrelevant features in your data can decrease the accuracy of the models.
• For example, area or perimeter length are irrelevant features for classifying shapes such as circles and rectangles: f(area) = ?
• Features such as the signature or the number of corners are more suitable for classifying shapes.
Feature Engineering
Deep learning learns features automatically, but requires lots of training data.
Underfitting & Overfitting
▪ Underfitting (high bias) may happen when our model is over-simplified
or not expressive enough (high training error and high test error).
▪ Overfitting (high variance) may happen when our model is too complex
and fits too specifically to the training set, but it does not generalize
well to new data (low training error but high test error).
[Figure: underfitting and overfitting on a classification task; separate markers show the test data]
Underfitting & Overfitting
▪ Underfitting and overfitting in regression task
[Figure: underfit, good fit and overfit models on a regression dataset]
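A sketch of both regimes using polynomial regression on synthetic data (the degree choices and noise level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 3, 30)
y = np.sin(x) + rng.normal(0, 0.1, x.shape)    # noisy samples of a true curve
x_test = np.linspace(0, 3, 100)
y_test = np.sin(x_test)

for degree in (1, 4, 15):   # too simple / reasonable / too complex
    p = np.polynomial.Polynomial.fit(x, y, degree)
    train_err = np.mean((p(x) - y) ** 2)
    test_err = np.mean((p(x_test) - y_test) ** 2)
    # Underfit: both errors high. Overfit: low train error, higher test error.
    print(f"degree {degree:2d}: train MSE {train_err:.4f}, test MSE {test_err:.4f}")
```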
Hyperparameter Tuning
[Diagram: the learning algorithm is trained on the training set, hyperparameters are tuned using a validation set, and the tuned model predicts on the test set]
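A hedged sketch of this loop using scikit-learn's GridSearchCV, which handles the validation step via cross-validation while a held-out test set is kept aside (the dataset and parameter grid are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Tune the hyperparameter max_depth on the training data via CV folds
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid={"max_depth": [2, 4, 8, None]}, cv=5)
search.fit(X_train, y_train)

print("best hyperparameters:", search.best_params_)
print("test accuracy:", search.score(X_test, y_test))   # final evaluation
```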
Next: