
Machine learning

Introduction

Mohamed FARAH

Academic year: 2024-2025

Machine Learning

 Machine Learning is the field of study that gives the computer
the ability to learn without being explicitly programmed
(Arthur Samuel, 1959)
Machine Learning

 Machine Learning:

 is a subfield of artificial intelligence based on mathematical and
statistical approaches that empower computers to learn from data
 resolves decision problems automatically, without explicit
programming
 relates to the design, optimisation and implementation of methods that
learn from past data in order to predict new observations

Machine Learning – new programming paradigm

Traditional Programming:   Data + Program → Computer → Output

Machine Learning:          Data + Output → Computer → Program

 Data-driven
 Automating automation: getting computers to programme themselves
Machine Learning

 A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance
at tasks in T, as measured by P, improves with experience E
(Tom Mitchell, 1998)

 Example:
• Experience (data): games played by the program (with itself)
• Performance measure: winning rate

 Learning is the acquisition of the ability to perform the task

 How to learn is another type of problem, and there are many methods

The Experience E / Data

 Most algorithms experience an entire dataset

 Dataset: a collection of examples, aka data points
 An example is a collection of features (data) that have been
quantitatively measured for some object/event that we
want the ML system to process
Data – Example
 Anderson's Iris data (the oldest dataset in statistics/ML, 1936)

• Measurements of 150 iris flowers

- 4 attributes: sepal length, sepal width, petal length, petal width,
so each flower is a vector in ℝ⁴ (the full dataset is a 150 × 4 matrix)

• 3 species: Setosa, Versicolor, Virginica

https://en.wikipedia.org/wiki/Iris_flower_data_set

Data as vectors, matrices, tensors

 Tensors: generalisation of matrices to n dimensions
(n is also called the rank, order or degree)
• 1D tensor: vector
• 2D tensor: matrix
• 3D, 4D, 5D tensors, etc.
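For illustration, here is a minimal sketch using NumPy (a library assumed here, not introduced in the slides) showing the same idea at different ranks:

```python
import numpy as np

vector = np.array([5.1, 3.5, 1.4, 0.2])   # 1D tensor: a single example
matrix = np.zeros((150, 4))               # 2D tensor: a whole dataset
frames = np.zeros((10, 64, 64))           # 3D tensor: 10 grayscale frames
batch_rgb = np.zeros((32, 64, 64, 3))     # 4D tensor: a batch of RGB images

print(vector.ndim, matrix.ndim, frames.ndim, batch_rgb.ndim)  # 1 2 3 4
```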
Data
 Dataset decomposition
• Training set: data to train with
• Validation set: determines when to stop training
• Test or generalisation set: data to test on

 These three sets are all disjoint
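One common way to realise this decomposition, sketched with scikit-learn (the library and the 60/20/20 proportions are illustrative assumptions, not prescribed by the course):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 4)             # toy features
y = np.random.randint(0, 2, size=1000)  # toy labels

# Carve out a disjoint test set first, then split the rest into train/validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```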

Dataset Assumptions

 Data are assumed to be generated by a probability distribution

 Typically we make i.i.d. assumptions
• Samples are independent from each other
• Training and test sets are identically distributed
(drawn from the same distribution)
The Task T

 ML enables tackling tasks too difficult to solve with fixed
programs written and designed manually
 T is usually described in terms of how the machine learning
system should process an example
 NB. The process of learning itself is not the task

The Performance Measures, P

 P is specific to the task T

 Well-known measures are based on the confusion matrix:
• Accuracy
• Precision
• Recall
• F-score
• etc.

! P is applied on data not seen before:
the test set ... not the training set
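These measures all derive from the confusion-matrix counts. A minimal sketch in plain Python for the binary case (the TP/FP/FN/TN naming is the usual convention, not notation from the slides):

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard performance measures from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}

# Evaluated on the test set, never on the training set.
print(classification_metrics(tp=40, fp=10, fn=5, tn=45))
```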
After the task is learned

 Processing of new data is called inference

 Computational costs are high during training and lower during inference

Related Domains

 Statistics: learning theory, data mining, inference
 Computing: AI, computer vision, information retrieval (IR)
 Engineering: signal processing, robotics, control
 Cognitive science, psychology, epistemology, neuroscience
 Economics: decision theory, game theory
Applications of Machine Learning


Computer Vision

 Image recognition, segmentation, classification, etc.

(Figure: input image → Model → "Cat" or "Dog")

 Example: recognition of handwritten characters
Computer Vision

 Example: face detection

Computer Vision

 Example: detection of pedestrians

(Figure: example training images)
Natural Language Processing (NLP)

 Example: classification of textual documents

Natural Language Processing (NLP)

 Example: detection of spam in emails

Hint: count the frequency and co-occurrence of certain keywords, e.g.
congratulations, lottery, win, prize, etc.
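A toy version of this keyword-counting idea (a hedged sketch in plain Python; the keyword list and threshold are invented for illustration, not taken from the course):

```python
SPAM_KEYWORDS = {"congratulations", "lottery", "win", "prize", "winner"}

def looks_like_spam(email: str, threshold: int = 2) -> bool:
    """Flag an email when it contains several spam-associated keywords."""
    words = email.lower().split()
    hits = sum(1 for w in words if w.strip(".,!?") in SPAM_KEYWORDS)
    return hits >= threshold

print(looks_like_spam("Congratulations! You win a lottery prize today"))  # True
print(looks_like_spam("Meeting moved to 3pm, see you there"))             # False
```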
Natural Language Processing (NLP)

 Example: automatic translation

(Figure: "How are you?" → Model → "Wie geht's dir?" — a translating machine)

Natural Language Processing (NLP)

 Example: recommendation systems
Natural Language Processing (NLP)

 Example: chatbots

(Figure: "How are you?" → Model → "I am fine, thank you" — a conversational agent / chatbot)

Bio-Informatics

 Sequence alignment, analysis of genetic data, etc.

 Example: prediction of Caesarean emergency conditions
Signal processing
 Speech recognition, person identification, speech to text, text to
speech, etc.

(Figure: audio signal → Model → "Hello" — speech recognition)

Other areas of application

 Robotics: estimation of positions, states, etc.
 Financial analysis: portfolio allocation, credit, grants, etc.
 Medicine: diagnosis, treatment, design of therapies, etc.
 Graphic design: realistic designs and simulations, etc.
 Social networks
 Content generation
 etc.
Learning Types

(based on tasks)


Learning Types

 Supervised Learning
 Unsupervised Learning
 Reinforcement Learning
Supervised learning


Supervised learning
 Given: a dataset that contains n samples
(x(1), y(1)), …, (x(n), y(n))
 Task: if a residence has x square feet, predict its price y

 e.g., the 15th sample (x(15), y(15)) with x(15) = 800: y = ?

(Figure: housing price prediction — price vs living area)
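For this regression task, a least-squares fit is the classic starting point (a hedged sketch with scikit-learn; the numbers are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy housing data: living area (sq ft) -> price (k$)
X = np.array([[500], [750], [1000], [1250], [1500]])
y = np.array([150, 200, 260, 310, 370])

model = LinearRegression().fit(X, y)
print(model.predict(np.array([[800]])))  # predicted price for an 800 sq ft residence
```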

Regression vs Classification
 Regression: the label y ∈ ℝ is a continuous variable
 e.g., price prediction

 Classification: the label y is a discrete variable
 e.g., the task of predicting the type of residence

x = (size, lot size) → y = house or townhouse?
Supervised Learning – Model Types

2 types of models:

• Discriminative model:
• estimates P(y | x) directly
• we learn the decision boundary
• Generative model:
• estimates P(x | y) and uses it to deduce P(y | x)
• we learn the probability distributions of the data
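To make the contrast concrete (a hedged sketch with scikit-learn; pairing logistic regression as discriminative with Gaussian naive Bayes as generative is a standard textbook example, not one named in the slides):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

discriminative = LogisticRegression(max_iter=1000).fit(X, y)  # models P(y | x) directly
generative = GaussianNB().fit(X, y)  # models P(x | y) per class, deduces P(y | x) via Bayes' rule

print(discriminative.predict_proba(X[:1]))
print(generative.predict_proba(X[:1]))
```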

Supervised learning in Computer Vision

 Image Classification
 x = raw pixels of the image, y = the main object

ImageNet Large Scale Visual Recognition Challenge. Russakovsky et al.’2015


Supervised learning in Computer Vision

 Object localization and detection
 x = raw pixels of the image, y = the bounding boxes

ImageNet Large Scale Visual Recognition Challenge. Russakovsky et al.’2015

Supervised learning in Computer Vision

 Recognition of handwritten characters (OCR)
x: values of pixel intensities of the image
y: identity of the character (class)
Supervised learning in NLP
 Machine translation

Unsupervised learning

Also called Knowledge discovery

Unsupervised Learning

 Dataset contains no labels: x(1), …, x(n)
 The target is not explicitly known
 Goal (vaguely posed): find interesting structures /
patterns in the data

(Figure: the same point cloud, labelled in the supervised case, unlabelled in the unsupervised case)

Clustering

 k-means clustering, mixture of Gaussians, etc.
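A minimal k-means sketch (using scikit-learn on synthetic blob data; nothing here is prescribed by the slides):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # unlabelled points

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.cluster_centers_)   # the 3 learned centroids
print(kmeans.labels_[:10])       # cluster assignment of the first 10 points
```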

Density Estimation

 Learning the probability distribution that generated the data:
• to generate new realistic data
• to distinguish "realistic" data from "false" data (e.g. spam
filtering)
• to compress data
• etc.
Density Estimation

 Given a sample X = {x_i}, i = 1..n, drawn from a distribution,
 obtain an estimate of the density function at any point
 Parametric:
• assume a parametric family of densities p_θ (e.g., the Gaussian
N(μ, σ²)) and obtain the best estimate θ̂ of θ
 Nonparametric:
• obtain a good estimate of the entire density directly from
the sample (e.g. a histogram)
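Both flavours in a few lines (a hedged sketch with NumPy and SciPy; the Gaussian sample is invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=1.5, size=1000)  # data from N(2, 1.5^2)

# Parametric: assume a Gaussian family and estimate (mu, sigma) from the sample.
mu_hat, sigma_hat = sample.mean(), sample.std()

# Nonparametric: estimate the density directly (kernel density estimate).
kde = stats.gaussian_kde(sample)

x = 2.0
print(stats.norm.pdf(x, mu_hat, sigma_hat))  # parametric density estimate at x
print(kde(x))                                # nonparametric estimate at x
```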

Representation learning

 Automatically extracting useful and significant characteristics
from raw data without labels

 The aim is to transform the data into a more compact and
informative representation (embeddings) that facilitates
subsequent tasks such as classification or clustering
Word Embedding

 Represent words by vectors
• each word is encoded as a vector
• semantic relations appear as consistent directions

(Figure: Paris→France, Rome→Italy, Berlin→Germany share the same
"capital-of" direction in the vector space)

 Word2vec [Mikolov et al.'13]
 GloVe [Pennington et al.'14]
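The vector-arithmetic property in the figure can be probed like this (a hedged sketch with the gensim library, which the slides do not mention; the pretrained vectors file "vectors.bin" is a hypothetical path):

```python
from gensim.models import KeyedVectors

# Hypothetical path to pretrained word2vec vectors.
vectors = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# "Paris" - "France" + "Italy" should land near "Rome" if the
# capital-of direction is consistent in the embedding space.
print(vectors.most_similar(positive=["Paris", "Italy"], negative=["France"], topn=3))
```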

Clustering Words with Similar Meaning (Hierarchically)

[Arora-Ge-Liang-M.-Risteski, TACL'17,18]
Dimensionality reduction

 Reduce the number of variables or dimensions of the data,
while preserving the essential information

(Figure: the "swiss roll" dataset, a 2D manifold embedded in 3D)

https://link.springer.com/article/10.1007/s00477-016-1246-2/figures/1

Latent Semantic Analysis (LSA)

(Figure: topic detection in a document-word matrix)

 Principal Component Analysis (PCA) is used in LSA

https://commons.wikimedia.org/wiki/File:Topic_detection_in_a_document-word_matrix.gif
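A PCA-based dimensionality reduction in miniature (a hedged sketch with scikit-learn on random data; applying it to a document-word matrix as in LSA would follow the same pattern):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))  # 100 samples with 50 features

pca = PCA(n_components=2)       # keep the 2 directions of largest variance
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                      # (100, 2)
print(pca.explained_variance_ratio_.sum())  # fraction of variance preserved
```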
Large Language Models (LLM)
 machine learning models for language, learnt on large-scale
language datasets
 can be used for many purposes

Language Models are Few-Shot Learners [Brown et al.'20]

https://openai.com/blog/better-language-models/

Reinforcement learning

Reinforcement learning

 Learning to make sequential decisions


 Chess
• 1997: Deep Blue (IBM) defeated world chess champion Garry Kasparov
in a six-game match.
• 2017: AlphaZero (DeepMind) defeated Stockfish (chess engine)

 Go
• 2016: AlphaGo (DeepMind) defeated 18-time world champion Lee
Sedol 4-1 in a five-game match.
• 2017: AlphaGo Master defeated world champion Ke Jie
• 2017: AlphaGo Zero (a more advanced version) surpassed all previous
versions

Reinforcement learning

 The algorithm can collect data interactively

(Figure: a loop — try the strategy and collect feedback → data
collection → train on the feedback → improve the strategy → repeat)
Reinforcement learning

 Problem data
 A state describes a situation
 An action allows switching between states
 A policy chooses the action to take given the current state
 At the end of each action, a positive or negative reward is observed

 Objectives (see the sketch below)
 Guide an agent to define a policy: improve the choice of action
at time t+1
 Avoid failure situations
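A tiny tabular Q-learning sketch on an invented 1D corridor world (the environment, rewards, and hyperparameters are all made up to illustrate states, actions, policy, and reward; it is not an algorithm given in the course):

```python
import random

N_STATES = 5            # corridor cells 0..4; reaching cell 4 is the goal
ACTIONS = [-1, +1]      # move left or right
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

alpha, gamma, epsilon = 0.1, 0.9, 0.2
random.seed(0)

for _ in range(500):                       # episodes
    state = 0
    while state != N_STATES - 1:
        # epsilon-greedy policy: mostly exploit, sometimes explore
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        nxt = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if nxt == N_STATES - 1 else -0.01  # + at goal, small - otherwise
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = nxt

# The learned policy should always move right, towards the goal.
print([max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)])
```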

Challenges

 The ability to generalise a model
 Overfitting
 Underfitting

 Curse of dimensionality (lots of features vs dataset size)

 Vanishing and exploding gradients (in Neural-Network-based
models)
 Data not available
 Data augmentation (for reduced datasets)
 Imbalanced datasets
 etc.
Generalisation
 a major challenge of ML
• ability to perform well on previously unseen inputs

 training error vs test / generalisation error

• training error: error on the training inputs

• test / generalisation error: expected error on a new input

 An ML training algorithm reduces the training error, which is a
task of optimisation

 What differentiates ML from pure optimisation is that the test /
generalisation error needs to be low as well

Typical learning curve

(Figure: training loss and validation loss vs number of training steps)


Overfitting
• A major problem for learning techniques!

• One can find a hypothesis that makes good predictions on the
training data but does not generalise well to the rest of the data.

• In the rest of the course, we will see methods to mitigate the
overfitting problem.

Vanishing and exploding gradients problem

 For Neural-Network-based models
• Vanishing gradients: occur when the gradients of the loss
function with respect to the parameters become very small
during backpropagation. This prevents the weights from
updating effectively, slowing or halting learning, especially
in the early layers of deep networks.
• Exploding gradients: occur when the gradients become
very large, causing unstable updates to the weights and
making training diverge.
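One common mitigation for the exploding case is gradient clipping (a hedged PyTorch sketch; neither the technique nor the library is named in the slides, and the tiny model is invented for illustration):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x, y = torch.randn(64, 10), torch.randn(64, 1)

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
# Rescale gradients so their global norm never exceeds 1.0,
# preventing a single huge gradient from destabilising training.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```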

Data augmentation

What?
 increase the size and diversity of a training dataset
 apply various transformations to the original data
 used when the original dataset is small or lacks diversity
Why?
 prevents overfitting by exposing the model to more varied data
 improves the model's ability to generalise to unseen data
 enhances performance in tasks like image classification,
object detection, natural language processing, etc.
Data augmentation

Common techniques:
1. Image data:
• rotation, flipping, cropping, scaling, and translation
• color jittering (adjusting brightness, contrast, saturation)
• adding noise or blurring
• random erasing or cutout
2. Text data:
• synonym replacement, random insertion or deletion of words
• back-translation (translating text to another language and back)
• shuffling sentences or phrases
3. Audio data:
• time stretching, pitch shifting, or adding background noise
4. Tabular data:
• adding noise to numerical features
• synthetic minority oversampling techniques (e.g., SMOTE)

Data augmentation

 Examples for image data (figure: an original image and several
randomly augmented variants)
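A typical image-augmentation pipeline (a hedged sketch with torchvision; the specific transforms, parameters, and the input file "cat.jpg" are illustrative choices, not ones given in the course):

```python
from torchvision import transforms
from PIL import Image

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),   # flipping
    transforms.RandomRotation(degrees=15),    # rotation
    transforms.ColorJitter(brightness=0.2,    # color jittering
                           contrast=0.2,
                           saturation=0.2),
    transforms.RandomResizedCrop(size=224),   # cropping + scaling
])

image = Image.open("cat.jpg")     # hypothetical input image
augmented = augment(image)        # a new, randomly transformed variant
augmented.save("cat_augmented.jpg")
```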
Imbalanced datasets

 Skewed class distributions can lead to biased models that favor
the majority class
 Common techniques: resampling (see the sketch below)
• Oversampling:
- increase the number of instances in the minority class
- examples: random oversampling, SMOTE (Synthetic Minority
Oversampling Technique), ADASYN
• Undersampling:
- reduce the number of instances in the majority class
- examples: random undersampling, Tomek links, cluster centroids
• Hybrid approaches:
- combine oversampling and undersampling for balanced results

Prerequisites

• Knowledge of numerical analysis: differentiation, partial
derivatives, gradients, integrals, etc.

• Knowledge of linear algebra: matrices, vectors, norms, scalar
products, etc.

• Knowledge of probability & statistics

• Knowledge of programming
References
• A. Géron. Hands-On Machine Learning with Scikit-Learn, Keras,
and TensorFlow: Concepts, Tools, and Techniques to Build
Intelligent Systems. O'Reilly Media Inc., 2019.
• C. Bishop. Pattern Recognition and Machine Learning.
Springer, 2006.
• R. Duda, P. Hart and D. Stork. Pattern Classification.
Prentice Hall, 2002.
• ...

