Supervised Machine Learning
By Dr. Raivrajsinh S. Vaghela
Outline
• Basics of Supervised Learning
• Prediction
• Classification
• Understanding Datasets
• Feature Selection
• Feature Normalization
• Data Cleaning
• Training, Testing & Validation Sets
Basics of Supervised Learning
• In supervised learning, the training dataset includes inputs and their
correct outputs (labels), which allow the model to learn over time.
• Some datasets are both large and diverse. However, some datasets are large but have low
diversity, and some are small but highly diverse. In other words, a large dataset doesn’t
guarantee sufficient diversity, and a dataset that is highly diverse doesn't guarantee
sufficient examples.
• For instance, a dataset might contain 100 years' worth of data, but only for the month of July.
Using this dataset to predict rainfall in January would produce poor predictions. Conversely,
a dataset might cover only a few years but contain every month. This dataset might produce
poor predictions because it doesn't contain enough years to account for variability.
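The size-versus-diversity distinction above can be checked before any training happens. The following is a minimal sketch, assuming each record is a hypothetical `(year, month, rainfall_mm)` tuple; the helper name `coverage_report`, the year ranges, and the placeholder rainfall values are illustrative and do not come from a real dataset.

```python
from collections import Counter

def coverage_report(records):
    """Summarize size and year/month diversity of (year, month, rainfall_mm) records."""
    months = Counter(month for _, month, _ in records)
    years = {year for year, _, _ in records}
    return {
        "n_examples": len(records),
        "n_years": len(years),
        "months_covered": sorted(months),
        "missing_months": [m for m in range(1, 13) if m not in months],
    }

# Hypothetical dataset A: 100 years, July only (large, low diversity).
# Rainfall values are placeholders; only the coverage matters here.
july_only = [(1924 + i, 7, 0.0) for i in range(100)]

# Hypothetical dataset B: 3 years, every month (small, high diversity).
few_years = [(2021 + y, m, 0.0) for y in range(3) for m in range(1, 13)]

report_a = coverage_report(july_only)
report_b = coverage_report(few_years)

print("A:", report_a["n_examples"], "examples, missing months:", report_a["missing_months"])
print("B:", report_b["n_examples"], "examples over", report_b["n_years"], "years")
```

Dataset A has many examples but no January data at all, while dataset B covers every month but only three years. Neither count alone indicates whether a dataset is adequate.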
Characterizing a Dataset by Its Features
• A dataset can also be characterized by the number of its features. For
example, some weather datasets might contain hundreds of features,
ranging from satellite imagery to cloud coverage values.
• Other datasets might contain only three or four features, like
humidity, atmospheric pressure, and temperature.
• Datasets with more features can help a model discover additional
patterns and make better predictions.
• However, datasets with more features don't always produce models
that make better predictions because some features might have no
causal relationship to the label.
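To illustrate why extra features do not automatically help, the sketch below builds a small synthetic dataset in which the label depends on one feature and is independent of another. Everything here is an assumption made for illustration: the feature names, the coefficient 3, and the noise level. Pearson correlation is used only as a rough screening signal; it measures linear association, not causation.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5000

# Synthetic features: one drives the label, the other is unrelated to it.
humidity = [rng.uniform(0, 1) for _ in range(n)]
unrelated = [rng.gauss(0, 1) for _ in range(n)]

# Synthetic label: depends on humidity plus a little noise (assumed relationship).
label = [3 * h + rng.gauss(0, 0.1) for h in humidity]

corr_humidity = pearson(humidity, label)
corr_unrelated = pearson(unrelated, label)

print("humidity vs label: ", round(corr_humidity, 3))
print("unrelated vs label:", round(corr_unrelated, 3))
```

The first correlation should be close to 1 and the second close to 0, which suggests the second feature adds a column to the dataset without adding information about the label.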
Understanding Datasets in the Context of Supervised Learning
Model Generation from Labeled Examples
• In supervised learning, a model is a (potentially complex) collection
of numbers that defines the mathematical relationship from specific
input feature patterns to specific output label values. The model
discovers these patterns through training.
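As a concrete, deliberately tiny illustration of "a collection of numbers discovered through training", the sketch below fits a model with just two numbers, a weight `w` and a bias `b`, to labeled examples. Gradient descent on mean squared error is an assumed training method chosen for simplicity; the slides do not specify one. The data, learning rate, and step count are likewise illustrative.

```python
def train_linear_model(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # the model's numbers, before any training
    n = len(xs)
    for _ in range(steps):
        # Prediction error on every labeled example with the current numbers.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Nudge the numbers in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Labeled examples: inputs (features) paired with correct outputs (labels).
# The labels follow y = 2x + 1, a relationship assumed for this sketch.
features = [0.0, 1.0, 2.0, 3.0, 4.0]
labels = [2 * x + 1 for x in features]

w, b = train_linear_model(features, labels)
print("learned weight:", round(w, 3), "learned bias:", round(b, 3))

# Use the trained numbers to predict the label for an unseen input.
prediction = w * 5.0 + b
```

After training, `w` and `b` should land close to 2 and 1. Those two numbers are the entire model in this case; larger models differ mainly in how many such numbers they hold.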