Machine Learning Unit IV
Linear Discriminant
A Linear Discriminant is a line (in 2D), a plane (in 3D), or a hyperplane (in higher dimensions)
used to separate two or more classes of data points. It tries to find the best line or boundary
that separates the classes as clearly as possible.
It is mainly used in classification problems, where we want to predict which class a data point
belongs to.
Simple Example:
Imagine you're trying to separate two types of fruit based on their features:
Weight
Color score (a number representing how red/orange it is)
Plot them on a graph with weight on the X-axis and color score on the Y-axis.
You might see the apples clustering in one area, and the oranges in another.
Now, Linear Discriminant Analysis (LDA) tries to find a straight line that best separates these
two clusters, so that when a new fruit comes in, we can say which side of the line it falls on: the apple side or the orange side.
Key Idea:
Linear discriminants maximize the separation between classes while minimizing the spread
(variance) within each class.
Think of it like drawing a straight line (or plane in higher dimensions) that
best separates the categories you want to classify.
How It Works (Simple Steps):
1. Calculate the mean of each class (e.g., mean weight and color score for apples and oranges).
2. Compute the scatter within each class (how spread out each class is around its own mean).
3. Compute the scatter between the classes (how far apart the class means are).
4. Find the projection direction (line) that makes the between-class scatter as large as possible relative to the within-class scatter (this ratio is written out just below).
5. Project all data onto this line and set a threshold to classify new data.
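In symbols, these steps amount to the standard Fisher criterion: if m1 and m2 are the class means after projection onto the line, and s1² and s2² are the spreads (scatters) of each class after projection, LDA chooses the direction that maximizes
J(w) = (m1 − m2)² / (s1² + s2²)
so the class means end up far apart (large numerator) while each class stays tightly packed (small denominator).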
machine learning unit iv
[Scatter plot: weight on the X-axis, color score on the Y-axis, with Class 0 = Apples and Class 1 = Oranges]
LDA will find the best line in this 2D space so that when a new fruit appears,
it can be projected onto the line, and based on where it falls, it can be
classified as an apple or an orange.
# A runnable version of this snippet (the fruit data values are illustrative):
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X = [[150, 0.9], [160, 0.85], [155, 0.8], [170, 0.95],
     [120, 0.3], [130, 0.25], [125, 0.2], [140, 0.35]]  # [weight, color score]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = apple, 1 = orange

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Fit the LDA model and evaluate it
lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)
y_pred = lda.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
Perceptron classifier
A perceptron is the simplest neural-network classifier: it takes numeric inputs and decides which of two groups they belong to. It learns by adjusting its internal rules (called weights) until it can correctly say which input belongs to which group.
1. The perceptron takes in numbers (like 150 grams and red color = 1).
2. It multiplies them by weights and adds them up.
3. If the total is above a certain point, it says “apple”; if not, it says
“orange.”
4. If it's wrong, it adjusts its weights a little bit.
5. Repeat until it gets really good at it.
Let's say we are training it to say if something is a cat (1) or not a cat (0)
based on features like ears, whiskers, size, etc.
Step 1: Initialize the weights and bias (e.g., to zeros). Then repeat for each training example:
1. Input the features: like [1, 0, 1] (maybe ears = yes, tail = no,
whiskers = yes).
2. Make a prediction:
o Multiply each input by its weight.
o Add them all up + bias.
o If the result is greater than 0, guess “cat” (1); else “not a cat”
(0).
3. Compare to actual answer (truth).
4. Update the weights and bias if it was wrong:
new weight = old weight + learning rate × (truth − prediction) × input
new bias = old bias + learning rate × (truth − prediction)
Learn from mistakes by slightly adjusting how important each input is.
Example:
Input 1   Input 2   Expected Output
   0         0             0
   0         1             0
   1         0             0
   1         1             1
(This is the logical AND function: the output is 1 only when both inputs are 1.)
Initial weights: w1 = 0, w2 = 0
Bias: 0
Learning rate: 1
Row 1 (inputs 0, 0 → expected 0):
(0×0) + (0×0) + 0 = 0 → output = 0 ✅ (Correct, no update)
Row 2 (inputs 0, 1 → expected 0):
(0×0) + (1×0) + 0 = 0 → output = 0 ✅ (Correct, no update)
Row 3 (inputs 1, 0 → expected 0):
(1×0) + (0×0) + 0 = 0 → output = 0 ✅ (Correct, no update)
Row 4 (inputs 1, 1 → expected 1):
(1×0) + (1×0) + 0 = 0 → output = 0 ❌ (Wrong: expected 1)
Update with learning rate 1: w1 = 0 + 1×(1−0)×1 = 1, w2 = 0 + 1×(1−0)×1 = 1, bias = 0 + 1×(1−0) = 1
Training then continues over the table, pass after pass, until every row is predicted correctly.
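The whole training loop fits in a few lines of Python. Here is a from-scratch sketch using the same AND data, zero initial weights, and learning rate 1 (variable names are my own):

# Perceptron trained on the AND truth table
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w1, w2, bias = 0, 0, 0
lr = 1  # learning rate

for epoch in range(10):  # a few passes over the table are enough here
    for (x1, x2), truth in data:
        total = x1 * w1 + x2 * w2 + bias
        prediction = 1 if total > 0 else 0
        # adjust only when the guess is wrong (truth - prediction is 0 otherwise)
        w1 += lr * (truth - prediction) * x1
        w2 += lr * (truth - prediction) * x2
        bias += lr * (truth - prediction)

print(w1, w2, bias)  # this setup converges to weights 2, 1 and bias -2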
Support Vector Machine (SVM)
Simple Analogy:
Imagine you're a teacher and you want to divide students into two groups based on their scores in Math and Science: say, those who need extra help and those who don't.
Now, your goal is to draw a line (in 2D) that clearly separates the two groups.
SVM finds the line that not only separates the groups but also stays as far
away as possible from the closest points of each group. This line is called the
maximum margin hyperplane.
Key Concepts:
Support Vectors: The closest data points to the hyperplane. These are
crucial for defining the boundary.
Margin: Distance between the hyperplane and the nearest points from each
class. SVM tries to maximize this.
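In symbols (a standard result, stated here for reference): if the hyperplane is written as w·x + b = 0, the margin works out to 2 / ||w||, so maximizing the margin is the same as making the weight vector w as small as possible while still classifying every training point correctly.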
Real Example:
Suppose you want to detect spam emails using features such as the number of links in each email.
SVM will look at this data and find a boundary that separates spam emails
from non-spam ones, using the support vectors to make the decision
boundary as clear as possible.
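A small scikit-learn sketch of this idea (the feature values and labels below are made up purely for illustration):

from sklearn.svm import SVC

# Each email: [number of links, number of spammy words] (invented features)
X = [[0, 1], [1, 0], [2, 1], [8, 6], [9, 4], [7, 7]]
y = [0, 0, 0, 1, 1, 1]  # 0 = not spam, 1 = spam

model = SVC(kernel="linear")
model.fit(X, y)
print(model.support_vectors_)   # the points that pin down the boundary
print(model.predict([[6, 5]]))  # classify a new email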
Real-Life Analogy:
Imagine you're trying to separate apples and oranges placed on a table. But this time, they're arranged in a circular pattern: one kind of fruit sits in the middle and the other forms a ring around it.
Think of it like lifting the oranges up (into 3D) so now you can cut them apart
with a flat sheet (a plane) instead of a line.
Non-linear SVM is used when your data can't be separated with a straight line (in 2D) or a flat plane (in higher dimensions). In 2D, no line can separate the circular pattern above, but using an RBF kernel, SVM can transform the data, find a flat separating boundary in a higher-dimensional space, and map it back to the original space, where it looks like a curve or a more complex boundary.
Popular Kernels:
Polynomial kernel: for curved boundaries
RBF (Radial Basis Function) kernel: for complex boundaries (most commonly used)
Kernel Trick
The kernel trick is a smart math trick that lets us solve problems where
data can’t be separated with a straight line — by pretending we’re
working in a higher-dimensional space without actually going there!
People wearing red shirts and blue shirts are mixed together in the room (2D floor).
You want to separate them, but you can't draw a line on the floor to split them.
Now imagine an elevator lifts everyone in red up to a higher level, adding a third dimension (height).
You then draw a flat floor (a plane) in this 3D room that separates them easily.
🎉 The kernel trick lets a computer do this kind of jump to higher dimensions
without actually calculating the elevator ride. It uses a function
(kernel) to fake it.
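A quick sketch of this trick in scikit-learn, using make_circles as synthetic stand-in data for the mixed-up room:

from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two rings of points: no straight line separates them in 2D
X, y = make_circles(n_samples=100, factor=0.3, noise=0.05, random_state=0)

# The RBF kernel does the "jump to higher dimensions" implicitly
model = SVC(kernel="rbf")
model.fit(X, y)
print("Training accuracy:", model.score(X, y))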
Linear Regression
Linear regression is a statistical method used to model the relationship
between a dependent variable and one or more independent variables,
assuming that the relationship is linear.
Simple Idea:
"If I know how much something costs today, can I predict what it
might cost tomorrow?"
Real-Life Example
Imagine you're a fruit seller 🍌🍎
You track your sales each day:

Day   Sales
 1     100
 2     150
 3     200
y=mx+b
Where:
y = prediction (sales)
x = input (days)
m = slope (how much y increases with x)
b = y-intercept (where the line starts)
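Working this out from the sales table above: sales go up by 50 each day, so m = (150 − 100) / (2 − 1) = 50, and plugging in day 1 gives 100 = 50×1 + b, so b = 50. The fitted line is y = 50x + 50, and the prediction for day 4 is y = 50×4 + 50 = 250.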
Term            Meaning
Linear          It's a straight line
Regression      You're predicting a number
Input (x)       The thing you know (like day number)
Output (y)      The thing you want to predict (sales)
Slope (m)       How steep the line is
Intercept (b)   Where the line crosses the y-axis
import numpy as np
from sklearn.linear_model import LinearRegression

# Data from the table above (days as a column vector for scikit-learn)
X = np.array([[1], [2], [3]])
y = np.array([100, 150, 200])

model = LinearRegression()
model.fit(X, y)
prediction = model.predict([[4]])
print(prediction)  # [250.], since day 4 follows the line y = 50x + 50
Logistic Regression
Logistic regression is used for classification rather than predicting a number: it estimates the probability that an input belongs to a class (e.g., yes/no) by passing a weighted sum of the inputs through the sigmoid function σ(z) = 1 / (1 + e^(−z)).
Real-Life Example
Imagine you have records of people's age and weight, each with a yes/no label; one record might look like:
Age 25, Weight 60 kg → No (0)
As with linear regression, the model is created with model = LogisticRegression() and trained with model.fit(X, y); a runnable sketch follows below.
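Here is that sketch with made-up records (all values and labels are purely illustrative):

from sklearn.linear_model import LogisticRegression

# Made-up records: [age, weight in kg] with illustrative yes/no labels
X = [[25, 60], [30, 65], [45, 85], [50, 95], [35, 70], [55, 100]]
y = [0, 0, 1, 1, 0, 1]

model = LogisticRegression()
model.fit(X, y)
print(model.predict([[40, 80]]))        # predicted class (0 or 1)
print(model.predict_proba([[40, 80]]))  # probability of each class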
Multilayer Perceptron
A Multilayer Perceptron (MLP) is a type of neural network — it's like a
brain-inspired system that helps computers learn complex things.
If a single perceptron can only draw straight lines (like "is this an apple or
not?"), an MLP can learn to draw curves, shapes, and patterns — like "is
this a cat, dog, or hamster?"
1. Input Layer – takes in the data (like age, height, pixels, etc.)
2. Hidden Layer(s) – the “thinking” part where patterns are learned
3. Output Layer – gives the final result (like Yes/No, or which class)
Data flows forward through these layers to produce a guess, the error is sent backward to adjust the weights (backpropagation), and this repeats again and again until it gets really good at the task.
Real-Life Example
Imagine you want to build an app that says whether a picture is a cat, dog,
or rabbit.
Term              Meaning
MLP               A deep neural network with layers
Hidden Layer      Middle part where learning happens
Activation        Adds power to learn curves & patterns
Backpropagation   How it fixes mistakes and learns
Use cases         Image recognition, voice, spam, etc.
from sklearn.neural_network import MLPClassifier

X = [[150, 50], [160, 60], [170, 80], [180, 90]]  # Features: [Height, Weight]
y = [0, 0, 1, 1]  # illustrative labels (e.g., two body-size groups)

model = MLPClassifier(hidden_layer_sizes=(4,), max_iter=3000, random_state=0)
model.fit(X, y)

new_person = [[165, 70]]  # a hypothetical new sample
prediction = model.predict(new_person)
print(prediction)
It’s like when you get a quiz question wrong, then look at the answer and fix
your thinking.
Backpropagation helps the MLP do the same — but with math 🧮
For example, the smallest interesting MLP has (see the sketch below):
2 inputs
1 hidden layer with 2 neurons
1 output neuron (predicts 0 or 1)
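Here is that exact 2-2-1 network in scikit-learn, trained on XOR, a pattern a single perceptron cannot learn (the solver and settings are one reasonable choice, not the only one):

from sklearn.neural_network import MLPClassifier

# XOR: the output is 1 only when the two inputs differ
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# 2 inputs -> hidden layer of 2 neurons -> 1 output neuron
model = MLPClassifier(hidden_layer_sizes=(2,), activation="tanh",
                      solver="lbfgs", max_iter=5000, random_state=1)
model.fit(X, y)
print(model.predict(X))  # ideally [0 1 1 0]; tiny networks can get stuck, so the seed matters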