ML Syllabus: First 5 Topics - Hints & Visuals
Introduction to Supervised Learning - Part 1
• Supervised learning = learning with labels.
• You are given questions (inputs) and answers (outputs).
• Goal: Learn a mapping from input to output.
• Used in spam detection, weather forecasting, etc.
• Example: Predicting house price from area, location.
Introduction to Supervised Learning - Part 2
• Two key types: Regression (predicts values) and Classification (predicts categories); see the sketch after this list.
• It’s like a teacher giving you solved examples.
• You learn from known data, and predict unknown.
• Data = (Features, Label).
• Supervised learning is the most widely used type of ML.
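A minimal sketch of the two types in scikit-learn; all numbers below are invented for illustration:

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

# Regression: predict a continuous value (price) from a feature (area).
X_area = [[50], [80], [120], [160]]   # area in square metres (invented)
y_price = [100, 155, 240, 310]        # price in thousands (invented)
reg = LinearRegression().fit(X_area, y_price)
print(reg.predict([[100]]))           # predicted price for a 100 m² house

# Classification: predict a category (1 = spam, 0 = not spam).
X_words = [[0], [1], [4], [7]]        # count of suspicious words (invented)
y_spam = [0, 0, 1, 1]
clf = LogisticRegression().fit(X_words, y_spam)
print(clf.predict([[5]]))             # predicted class for a new email
```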
Introduction to Supervised Learning - Part 3
• Example Dataset: Features = [Area, Bedrooms], Label = Price.
• Model learns patterns in training data.
• Test data checks how well it generalizes.
• Accuracy depends on quality and quantity of data.
• You train first, then test later (see the split sketched below)!
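A small sketch of the train-then-test workflow with scikit-learn's `train_test_split`; the [Area, Bedrooms] → Price values are invented:

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Features = [Area, Bedrooms], Label = Price (values invented).
X = [[50, 1], [80, 2], [100, 3], [120, 3], [150, 4], [200, 5]]
y = [100, 160, 210, 250, 320, 410]

# Hold out a third of the data to check generalization on unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0)

model = LinearRegression().fit(X_train, y_train)  # train first...
print(model.score(X_test, y_test))                # ...then test (R² on test data)
```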
Introduction to Supervised Learning - Part 4
• Real-life: Face detection, voice recognition, medical diagnosis.
• Learning stops after training in basic supervised methods.
• There are many algorithms under this: Linear Regression, SVM, etc.
• Common toolkits: Scikit-learn, TensorFlow, PyTorch.
• Start with simple models, move to complex ones as needed.
Introduction to Supervised Learning - Part 5
• Key Terms: Features, Labels, Training set, Test set.
• Model performance: Accuracy, Loss, Error rate (computed in the sketch below).
• High bias = underfitting; high variance = overfitting.
• Evaluation helps choose best model.
• Let’s explore types of models next!
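A quick sketch of the performance terms with scikit-learn's metrics; labels and predictions below are invented:

```python
from sklearn.metrics import accuracy_score, mean_squared_error

y_true = [1, 0, 1, 1, 0, 1]   # invented ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1]   # invented model predictions

acc = accuracy_score(y_true, y_pred)  # fraction of correct predictions
print("accuracy:", acc)               # 5/6 ≈ 0.83
print("error rate:", 1 - acc)         # 1/6 ≈ 0.17

# For regression, "loss" is often the mean squared error.
print("MSE:", mean_squared_error([2.0, 3.5], [2.5, 3.0]))  # 0.25
```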
Discriminative and Generative Models - Part 1
• Discriminative models: Focus on boundaries between classes.
• They learn P(y|x): given an input, how likely is each output? (See the sketch below.)
• Examples: Logistic Regression, SVM, Neural Networks.
• Used when classification accuracy is key.
• Fast to train and test.
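A minimal illustration of "learning P(y|x)" with logistic regression; the one-feature dataset is invented:

```python
from sklearn.linear_model import LogisticRegression

# Invented data: one feature, binary label, classes separated around x ≈ 2.
X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 1, 1]

clf = LogisticRegression().fit(X, y)

# predict_proba returns P(y|x) for each class: the model cares only about
# the boundary between classes, not about how x itself is distributed.
print(clf.predict_proba([[2.25]]))  # uncertain near the boundary
print(clf.predict_proba([[4.5]]))   # confidently class 1
```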
Discriminative and Generative Models - Part 2
• Generative models: Learn how data is generated.
• They learn P(x, y) or P(x|y): Full data distribution.
• Examples: Naive Bayes, Hidden Markov Models.
• Can generate new data too (sketched below)!
• Often better with smaller datasets.
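A hedged sketch with Gaussian Naive Bayes: it estimates a Gaussian P(x|y) per class, so we can also sample new points from what it learned. The fitted means and variances are exposed as scikit-learn's `theta_` and `var_` attributes (named `sigma_` in older versions):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Invented 1-D data: class 0 clusters near 0, class 1 near 5.
X = np.array([[0.1], [0.3], [-0.2], [4.8], [5.1], [5.3]])
y = np.array([0, 0, 0, 1, 1, 1])

gnb = GaussianNB().fit(X, y)
print(gnb.predict([[2.4]]))  # classify as usual

# Because the model learned P(x|y), we can generate new class-1 samples
# by drawing from its fitted Gaussian for that class.
mean, var = gnb.theta_[1, 0], gnb.var_[1, 0]
print(np.random.default_rng(0).normal(mean, np.sqrt(var), size=3))
```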
Discriminative and Generative Models - Part 3
• Comparison: Discriminative = boundaries, Generative = distribution.
• Discriminative is like a judge deciding winner.
• Generative is like a storyteller describing all players.
• Discriminative models are often more accurate for pure classification.
• Generative models are more flexible and can be easier to interpret.
Discriminative and Generative Models - Part 4
• Use Case: Spam classification.
• Discriminative: Focus only on spam/not-spam line.
• Generative: Tries to model how spam and non-spam look.
• Both have strengths, choose based on use case.
• Try both in practice for comparison (see the sketch below).
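A small sketch trying both on an invented mini spam corpus; with word-count features, MultinomialNB is the usual generative baseline and LogisticRegression the discriminative one:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

texts = ["win cash now", "meeting at noon", "free prize win", "lunch tomorrow"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam (invented)

vec = CountVectorizer()
X = vec.fit_transform(texts)           # bag-of-words counts
new = vec.transform(["win a free prize"])

for model in (LogisticRegression(), MultinomialNB()):
    model.fit(X, labels)
    print(type(model).__name__, model.predict(new))
```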
Discriminative and Generative Models - Part 5
• Fun Fact: GANs are Generative Adversarial Networks.
• They pit a generator (generative) against a discriminator (discriminative) during training.
• Used in image generation, art creation, etc.
• Understanding these basics helps in deep learning too!
• Next up: Regression techniques.
Linear Regression & Least Squares - Part 1
• Linear Regression: Predicting continuous values.
• Best fit line: y = mx + c.
• Least Squares: minimizes the squared difference between actual and predicted values (see the sketch below).
• It’s like drawing the most balanced line through data points.
• Used in: predicting prices, temperature, etc.
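A minimal least-squares sketch with NumPy: `np.polyfit` with degree 1 returns the slope m and intercept c that minimize the total squared error (data invented):

```python
import numpy as np

# Invented points lying roughly on a line.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

m, c = np.polyfit(x, y, deg=1)  # least-squares fit of y = m*x + c
print(m, c)                     # slope ≈ 2, intercept ≈ 0

print(m * 6.0 + c)              # use the fitted line to predict a new point
```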
Linear Regression & Least Squares - Part 2
• Feature = input variable, Label = output.
• Training: Find best 'm' and 'c' (slope and intercept).
• Predictions get better as more patterns are learned.
• Assumes linear relationship between input and output.
• Easy to implement and interpret.
Linear Regression & Least Squares - Part 3
• Limitations: Can’t model curves or complex patterns.
• Sensitive to outliers.
• You can extend it to multiple variables: Multiple Linear Regression.
• Still widely used in many business applications.
• Foundation for more advanced models.
Linear Regression & Least Squares - Part 4
• Error = Actual - Predicted.
• Squared Error helps penalize large mistakes.
• Goal: Minimize total squared error.
• Gradient Descent: a popular method to find the best line (sketched below).
• Helps in understanding optimization.
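A bare-bones gradient-descent sketch for fitting y = m*x + c by minimizing mean squared error; the learning rate and step count are arbitrary choices for this invented data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

m, c = 0.0, 0.0  # start from a flat line
lr = 0.01        # learning rate: small, arbitrary step size

for _ in range(5000):
    error = (m * x + c) - y           # predicted minus actual
    m -= lr * 2 * np.mean(error * x)  # step down the MSE gradient w.r.t. m
    c -= lr * 2 * np.mean(error)      # step down the MSE gradient w.r.t. c

print(m, c)  # should land near the least-squares solution
```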
Linear Regression & Least Squares - Part 5
• Visual: Data points scattered with a line through them.
• The closer the points are to the line, the better the model.
• Line can be used to predict new unknown points.
• Evaluation: the R² score tells how well the line fits (computed below).
• Let’s look at model problems: Under/Overfitting next.
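A one-liner sketch of R² with scikit-learn; values near 1 mean the line explains most of the variation (numbers invented):

```python
from sklearn.metrics import r2_score

y_actual = [2.1, 3.9, 6.2, 8.1, 9.8]
y_predicted = [2.0, 4.0, 6.0, 8.0, 10.0]  # from some fitted line

print(r2_score(y_actual, y_predicted))    # close to 1.0 → a good fit
```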
Underfitting / Overfitting & Cross-Validation - Part 1
• Underfitting: Model too simple to learn patterns.
• Overfitting: Model too complex; it memorizes the training data, noise included.
• Balance is key for good performance.
• Training error ≠ Testing error.
• Goal: Generalize well to unseen data.
Underfitting / Overfitting & Cross-Validation - Part 2
• Underfit example: a straight line fit to curve-shaped data.
• Overfit example: a wiggly line matching every point (both sketched below).
• Good model = not too simple, not too complex.
• Bias = error due to simplicity.
• Variance = error due to complexity.
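A hedged sketch of both failure modes by varying polynomial degree in NumPy; degree 1 is too simple for the quadratic data, while degree 9 chases the noise:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 12)
y = x**2 + rng.normal(0, 0.5, size=x.size)  # noisy quadratic (invented)

for degree in (1, 2, 9):
    coeffs = np.polyfit(x, y, degree)  # fit a polynomial of this degree
    pred = np.polyval(coeffs, x)
    print(f"degree {degree}: training MSE = {np.mean((y - pred) ** 2):.3f}")

# Degree 1 underfits (high error everywhere); degree 9 drives training error
# toward zero by memorizing noise, so it generalizes worse than degree 2.
```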
Underfitting / Overfitting & Cross-Validation - Part 3
• Cross-validation: Split data into parts.
• Train on one part, validate on another.
• K-Fold CV: split into k parts, rotating which part is used for validation (see the sketch below).
• Helps detect overfitting before testing.
• More reliable than single train-test split.
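A minimal K-Fold sketch with scikit-learn's `cross_val_score`, reusing the invented house-price data from earlier:

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X = [[50, 1], [80, 2], [100, 3], [120, 3], [150, 4], [200, 5]]
y = [100, 160, 210, 250, 320, 410]

# 3-fold CV: each fold takes one turn as the validation set.
scores = cross_val_score(LinearRegression(), X, y, cv=3)
print(scores)         # one R² score per fold
print(scores.mean())  # more reliable than a single train-test split
```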
Underfitting / Overfitting & Cross-Validation - Part 4
• Use validation scores to pick best model.
• Avoids depending on one lucky (or unlucky) data split.
• Used in all modern ML workflows.
• Tools like scikit-learn make it easy.
• Can also help tune hyperparameters (sketched below).
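A short sketch of hyperparameter tuning with `GridSearchCV`, using Ridge's regularization strength `alpha` as the example hyperparameter (data invented):

```python
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X = [[50, 1], [80, 2], [100, 3], [120, 3], [150, 4], [200, 5]]
y = [100, 160, 210, 250, 320, 410]

# Try several regularization strengths; keep the one with the best CV score.
search = GridSearchCV(Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0]}, cv=3)
search.fit(X, y)
print(search.best_params_)  # hyperparameter picked by validation scores
print(search.best_score_)   # its mean cross-validation R²
```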
Underfitting / Overfitting & Cross-Validation - Part 5
• Diagram: Underfit = flat line, Overfit = wavy, Good fit = smooth.
• Validation curve: Shows training vs validation error.
• Sweet spot = where both errors are low.
• Cross-validation helps find it.
• Next: Make models simpler using Lasso Regression.
Lasso Regression - Part 1