
Machine Learning

Unit 1
Course outlines:
Unit I: Introduction to machine learning:
Introduction – Learning, Types of Learning, Well-Defined Learning Problems, Designing a Learning System, History of ML, Introduction to
Machine Learning Approaches, Introduction to Model Building, Sensitivity Analysis, Underfitting and Overfitting, Bias and Variance,
Concept Learning Task, Find-S Algorithm, Version Space and Candidate Elimination Algorithm, Inductive Bias, Issues in Machine
Learning, and Data Science vs Machine Learning.
Unit-II Mining association and supervised learning:
Classification and Regression, Regression: Linear Regression, Multiple Linear Regression, Logistic Regression, Polynomial Regression,
Decision Trees: ID3, C4.5, CART. Apriori Algorithm: Market basket analysis, Association Rules. Neural Networks: Introduction,
Perceptron, Multilayer Perceptron, Support vector machine.
UNIT-III UNSUPERVISED LEARNING:
Introduction to clustering, K-means clustering, K-Nearest Neighbor, Iterative distance-based clustering, Dealing with continuous,
categorical values in K-Means, Hierarchical: AGNES, DIANA, Partitional: K-means clustering, K-Mode Clustering, density-based
clustering, Expectation Maximization, Gaussian Mixture Models.
UNIT-IV PROBABILISTIC LEARNING & ENSEMBLE
Bayesian Learning, Bayes Optimal Classifier, Naïve Bayes Classifier, Bayesian Belief Networks. Ensemble methods: Bagging &
boosting, C5.0 boosting, Random Forest, Gradient Boosting Machines and XGBoost.
UNIT-V REINFORCEMENT LEARNING & CASE STUDIES
Reinforcement Learning: Introduction to Reinforcement Learning, Learning Task, Example of Reinforcement Learning in Practice,
Learning Models for Reinforcement (Markov Decision Process, Q-Learning: Q-Learning function and Q-Learning algorithm), Application
of Reinforcement Learning.
Case Studies: Healthcare, E-Commerce, Smart Cities
Introduction: Machine learning
▪ Machine learning is a subfield of artificial intelligence that focuses on the development of
algorithms that learn patterns from data and make predictions based on what they have learned.
▪ Aim: Analyse data to discover patterns, make predictions, or automate tasks.

Fig 1: Machine Learning [1]


Applications of machine learning:
Below are some important real-world applications of machine learning, with brief explanations.
1. Healthcare: Disease Diagnosis and Prediction
• Machine learning is used for diagnosing medical conditions and predicting disease
outcomes. For example, ML algorithms can analyze patient data to identify patterns
associated with specific diseases, enabling early diagnosis and personalized treatment
plans.
2. Finance: Fraud Detection
• In the financial industry, machine learning is employed for detecting fraudulent activities.
Algorithms analyze transaction data and user behavior to identify unusual patterns,
helping financial institutions prevent and mitigate fraud.
3. E-commerce: Recommendation Systems
• Many e-commerce platforms leverage machine learning to provide personalized product
recommendations. Algorithms analyze user behavior, purchase history, and preferences to
suggest products, enhancing the user experience and increasing sales.
4. Autonomous Vehicles: Computer Vision
• Machine learning plays a crucial role in autonomous vehicles for tasks such as object
detection, lane tracking, and decision-making. Computer vision algorithms enable
vehicles to interpret and respond to their surroundings, ensuring safe navigation.
Applications of machine learning:
5. Natural Language Processing (NLP): Chatbots and Virtual Assistants
• NLP, a subset of machine learning, is used to develop chatbots and virtual assistants that can understand and respond to
natural language queries. These applications are widely used in customer service, providing immediate and automated
support.
6. Manufacturing: Predictive Maintenance
• Predictive maintenance utilizes machine learning to analyze sensor data from machinery and equipment. By identifying
patterns indicative of potential failures, manufacturers can schedule maintenance proactively, reducing downtime and
extending the lifespan of equipment.
7. Marketing: Customer Segmentation
• Machine learning is employed in marketing for customer segmentation and targeted advertising. Algorithms analyze
customer behavior and preferences to group individuals with similar characteristics, allowing businesses to tailor marketing
strategies more effectively.
8. Energy: Smart Grids and Consumption Prediction
• In the energy sector, machine learning is used for optimizing energy distribution in smart grids. ML algorithms analyze data
to predict energy consumption patterns, helping utilities manage resources efficiently and reduce waste.
9. Human Resources: Talent Acquisition
• Machine learning is applied in human resources for talent acquisition and recruitment. Algorithms analyze resumes and
candidate profiles to match them with job requirements, streamlining the hiring process and improving the quality of
candidate selection.
10. Agriculture: Crop Monitoring and Yield Prediction
• Machine learning is used in agriculture for tasks such as crop monitoring, disease detection, and yield prediction. By
analyzing data from sensors, satellites, and other sources, farmers can make informed decisions to optimize crop production.
AI, ML, and DL
Artificial intelligence, commonly referred to as AI, is the process of imparting data, information, and human
intelligence to machines. The main goal of Artificial Intelligence is to develop self-reliant machines that can think
and act like humans. These machines can mimic human behavior and perform tasks by learning and problem-
solving. Most of the AI systems simulate natural intelligence to solve complex problems.
Let’s have a look at an example of an AI-driven product - Amazon Echo.
Artificial Intelligence:

Amazon Echo is a smart speaker that uses Alexa, the virtual assistant AI technology developed by
Amazon. Amazon Alexa is capable of voice interaction, playing music, setting alarms, playing
audiobooks, and giving real-time information such as news, weather, sports, and traffic reports.

As you can see in the illustration below, the person wants to know the current temperature in
Chicago. The person’s voice is first converted into a machine-readable format. The formatted
data is then fed into the Amazon Alexa system for processing and analyzing. Finally, Alexa
returns the desired voice output via Amazon Echo.
AI, ML and DL

• Artificial Intelligence is the concept of creating smart intelligent machines.

• Machine Learning is a subset of artificial intelligence that helps you build AI-driven applications.

• Deep Learning is a subset of machine learning that uses vast volumes of data and complex algorithms to train
a model.
Difference between AI, ML and DL
Aspect: Definition
• AI: Intelligence demonstrated by machines, aiming to mimic human cognitive functions.
• ML: Subset of AI that enables systems to learn and make predictions without explicit programming.
• DL: Subfield of ML that uses neural networks with multiple layers to model and process complex data representations.

Aspect: Example
• AI: Siri, Google Assistant, chess-playing computers.
• ML: Spam email filters, image recognition, recommender systems.
• DL: Image and speech recognition, natural language processing, autonomous vehicles.

Aspect: Data Dependency
• AI: May or may not require data.
• ML: Requires labeled data for training models.
• DL: Highly dependent on large labeled datasets.

Aspect: Complexity
• AI: Can be rule-based or involve complex reasoning.
• ML: Complexity varies with algorithms and tasks.
• DL: Complex models capable of capturing intricate patterns.

Aspect: Training Time
• AI: Training may not involve explicit learning from data.
• ML: Training time depends on the dataset and algorithm complexity.
• DL: Can be computationally intensive, requiring significant time for training.

Aspect: Applications
• AI: Natural language processing, expert systems, robotics.
• ML: Image recognition, speech recognition, fraud detection.
• DL: Image and speech recognition, language translation, autonomous vehicles.
Machine learning types:

➢ Supervised Learning
➢ Unsupervised Learning
➢ Semi-Supervised Learning
➢ Reinforcement Learning
➢ Transfer Learning
➢ Deep Learning
Supervised learning:

• Supervised learning involves training a model using labeled data.


• Labeled data is data in which each data point is associated with a known target or output.
• The goal is to learn a mapping function that captures the relationship between inputs and outputs.
• The trained model can then make predictions or classify new, unseen data based on what it has learned.
• Supervised learning can be used for both classification and regression tasks.

Table 1: Labeled dataset

Data point | Age | Gender | Income  | Purchased
1          | 25  | Male   | $40,000 | Yes
2          | 30  | Female | $35,000 | No
3          | 22  | Male   | $28,000 | Yes
4          | 28  | Female | $45,000 | No
5          | 35  | Male   | $50,000 | Yes

How does supervised learning work?

Fig 2: Supervised learning process [2]


Steps involved in Supervised machine learning:
➢ Dataset Creation: A labeled dataset is created, comprising input features (also called predictors or
independent variables) and their corresponding output labels (also called dependent variables or
targets).

➢ Model Training: The labeled dataset is used to train a model using various algorithms such as
regression, decision trees, random forests, support vector machines, or neural networks. The model
learns the underlying patterns and relationships between the input features and the output labels.

➢ Model Evaluation: The trained model is evaluated using separate test data that was not used during
the training phase. Evaluation metrics such as accuracy, precision, recall, or mean squared error are
commonly used to assess the model's performance.

➢ Prediction: Once the model is trained and validated, it can be used to make predictions on new, unseen
data by providing the input features, and the model generates the corresponding output based on its
learned patterns.
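To make these four steps concrete, here is a minimal sketch of the supervised workflow using scikit-learn; the dataset (Iris) and the model (a random forest) are illustrative choices, not part of the course material:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 1. dataset creation: labeled input features X and output labels y
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# 2. model training: the model learns patterns linking features to labels
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# 3. model evaluation on test data not seen during training
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))

# 4. prediction for a new, unseen feature vector
print("Prediction:", model.predict([[5.1, 3.5, 1.4, 0.2]]))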
Data pre-processing:
➢ Pre-processing includes a number of techniques and actions:

➢ Data cleaning: These techniques, manual and automated, remove data incorrectly added or classified.

➢ Data imputation: Most ML frameworks include methods and APIs for balancing or filling in missing data. Techniques
generally include imputing missing values with statistics such as the mean or median, or with values derived from the
k-nearest neighbors (k-NN) of the data in the given field.

➢ Oversampling: Bias or imbalance in the dataset can be corrected by generating more observations/samples with
methods like repetition.

➢ Data integration: Combining multiple datasets to get a large corpus can overcome incompleteness in a single
dataset.

➢ Data normalization: The scale of a dataset affects the memory and processing required for iterations during training.
Normalization rescales the data to a common order of magnitude, which stabilizes and speeds up training.
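
A small, hedged illustration of two of these steps (imputation and normalization) with scikit-learn; the tiny dataset below is hypothetical:

import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.preprocessing import MinMaxScaler

# a tiny hypothetical dataset with one missing income value
df = pd.DataFrame({"age": [25, 30, 22, 28, 35],
                   "income": [40000, 35000, np.nan, 45000, 50000]})

# data imputation: fill the missing value with the column mean ...
df["income"] = SimpleImputer(strategy="mean").fit_transform(df[["income"]]).ravel()
# ... or, alternatively, with a k-nearest-neighbors estimate:
# df[:] = KNNImputer(n_neighbors=2).fit_transform(df)

# data normalization: rescale every feature into the [0, 1] range
print(MinMaxScaler().fit_transform(df))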
• Advantages of Supervised Machine Learning
• Supervised learning models can achieve high accuracy because they are trained
on labelled data.
• The decision-making process of supervised learning models is often
interpretable.
• Pre-trained supervised models can often be reused, which saves time and
resources compared with developing new models from scratch.
• Disadvantages of Supervised Machine Learning
• It may struggle with unseen or unexpected patterns that are not present
in the training data.
• It can be time-consuming and costly, as it relies on labeled data.
• It may generalize poorly to new data.
• Applications of Supervised Learning
• Supervised learning is used in a wide variety of applications, including:
• Image classification: Identify objects, faces, and other features in images.
• Natural language processing: Extract information from text, such as sentiment, entities, and relationships.
• Speech recognition: Convert spoken language into text.
• Recommendation systems: Make personalized recommendations to users.
• Predictive analytics: Predict outcomes, such as sales, customer churn, and stock prices.
• Medical diagnosis: Detect diseases and other medical conditions.
• Fraud detection: Identify fraudulent transactions.
• Autonomous vehicles: Recognize and respond to objects in the environment.
• Email spam detection: Classify emails as spam or not spam.
• Quality control in manufacturing: Inspect products for defects.
• Credit scoring: Assess the risk of a borrower defaulting on a loan.
• Gaming: Recognize characters, analyze player behavior, and create NPCs.
• Customer support: Automate customer support tasks.
• Weather forecasting: Make predictions for temperature, precipitation, and other meteorological parameters.
• Sports analytics: Analyze player performance, make game predictions, and optimize strategies.
Unsupervised Learning:
▪ Unsupervised learning deals with unlabeled data.

▪ The model learns to identify similarities, differences, or groupings in the data based on its intrinsic properties.

▪ The objective is to discover the underlying structure or patterns within the data without any predefined output labels.

▪ Common tasks in unsupervised learning include clustering similar data points together or dimensionality reduction to
identify important features.

Table 2: Unlabeled data points

Data point | Age | Gender | Income  | Purchased?
1          | 25  | Male   | $40,000 |
2          | 30  | Female | $35,000 |
3          | 22  | Male   | $28,000 |
4          | 28  | Female | $45,000 |
5          | 35  | Male   | $50,000 |
Working of Unsupervised machine learning:

Fig 3: Working of Unsupervised machine learning [3]


• Advantages of Unsupervised Machine Learning
• It helps to discover hidden patterns and various relationships between
the data.
• Used for tasks such as customer segmentation, anomaly
detection, and data exploration.
• It does not require labeled data and reduces the effort of data labeling.
• Disadvantages of Unsupervised Machine Learning
• Without using labels, it may be difficult to predict the quality of the
model’s output.
• Cluster Interpretability may not be clear and may not have meaningful
interpretations.
• Extracting meaningful features from raw data often requires additional
techniques such as autoencoders and dimensionality reduction.
• Applications of Unsupervised Learning
• Here are some common applications of unsupervised learning:
• Clustering: Group similar data points into clusters.
• Anomaly detection: Identify outliers or anomalies in data.
• Dimensionality reduction: Reduce the dimensionality of data while preserving its essential
information.
• Recommendation systems: Suggest products, movies, or content to users based on their historical
behavior or preferences.
• Topic modeling: Discover latent topics within a collection of documents.
• Density estimation: Estimate the probability density function of data.
• Image and video compression: Reduce the amount of storage required for multimedia content.
• Data preprocessing: Help with data preprocessing tasks such as data cleaning, imputation of missing
values, and data scaling.
• Market basket analysis: Discover associations between products.
• Genomic data analysis: Identify patterns or group genes with similar expression profiles.
• Image segmentation: Segment images into meaningful regions.
• Community detection in social networks: Identify communities or groups of individuals with similar
interests or connections.
• Customer behavior analysis: Uncover patterns and insights for better marketing and product
recommendations.
• Content recommendation: Classify and tag content to make it easier to recommend similar items to
users.
Steps involved to develop unsupervised model:

1. Dataset Preparation: An unlabeled dataset is collected, consisting only of input features without any
corresponding output labels.

2. Model Training: The model is trained on the unlabeled data using algorithms such as clustering,
dimensionality reduction, or generative models. The model identifies patterns, relationships, or groupings in
the data based on statistical properties or other measures of similarity.

3. Model Evaluation: Unsupervised learning models are evaluated based on internal metrics such as cohesion,
separation, or reconstruction error. Domain-specific evaluation measures can also be utilized, depending on
the task.

4. Knowledge Extraction: Once the model is trained, it can be used to gain insights, find anomalies, or create
representations that aid in downstream tasks, such as data visualization, anomaly detection, or feature
extraction.
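
A minimal sketch of this workflow, assuming scikit-learn and applying k-means to the unlabeled age/income values from Table 2:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# 1. dataset preparation: age and income only, with no "Purchased" labels
X = np.array([[25, 40000], [30, 35000], [22, 28000],
              [28, 45000], [35, 50000]])
X_scaled = StandardScaler().fit_transform(X)  # scale features first

# 2. model training: group the points into two clusters
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X_scaled)

# 3. model evaluation with an internal metric (cohesion vs. separation)
print("Cluster labels:", kmeans.labels_)
print("Silhouette score:", silhouette_score(X_scaled, kmeans.labels_))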
Semi-supervised learning:
▪ Semi-supervised machine learning is a learning process that combines elements of both supervised and
unsupervised learning.
▪ In this approach, a model is trained on a dataset that contains a mixture of labeled and unlabeled examples.
▪ The goal of semi-supervised learning is to leverage the information present in both labeled and unlabeled data
to improve the model's performance, especially when obtaining a large amount of labeled data is costly or
impractical.

Fig 4: Semi-supervised learning [4]


Semi-supervised learning:

Fig 5: Semi-supervised learning [4]


Semi-supervised learning:

▪ Suppose you have 100 labeled movie reviews (50 positive, 50 negative) and 1000 unlabeled movie reviews.

▪ You train a base sentiment analysis model using the labeled data.

▪ In the self-training phase, you use the base model to predict sentiment labels for the unlabeled reviews.

▪ You treat these predictions as pseudo-labels and incorporate them into the training set.

▪ In the co-training phase, you could train two separate sentiment analysis models on different views of the data
(e.g., bag-of-words features and a second, independent feature representation). These models exchange their predictions
on the unlabeled data to enhance each other's training.
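
A compact sketch of the self-training idea using scikit-learn's SelfTrainingClassifier; the digits dataset and the 90% unlabeled split are illustrative assumptions:

import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = load_digits(return_X_y=True)

# pretend 90% of the labels are unknown; unlabeled points are marked -1
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) < 0.9] = -1

# self-training: fit on the labeled data, pseudo-label confident unlabeled
# points, add them to the training set, and repeat until nothing changes
base = LogisticRegression(max_iter=1000)
model = SelfTrainingClassifier(base, threshold=0.9).fit(X, y_partial)
print("Accuracy on all data:", model.score(X, y))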
• Advantages of Semi- Supervised Machine Learning
• It leads to better generalization as compared to supervised learning, as it
takes both labeled and unlabeled data.
• Can be applied to a wide range of data.
• Disadvantages of Semi- Supervised Machine Learning
• Semi-supervised methods can be more complex to implement compared
to other approaches.
• It still requires some labeled data that might not always be available or
easy to obtain.
• Noisy or unrepresentative unlabeled data can degrade the model's performance.
• Applications of Semi-Supervised Learning
• Here are some common applications of semi-supervised learning:
• Image Classification and Object Recognition: Improve the accuracy of models by
combining a small set of labeled images with a larger set of unlabeled images.
• Natural Language Processing (NLP): Enhance the performance of language models and
classifiers by combining a small set of labeled text data with a vast amount of unlabeled
text.
• Speech Recognition: Improve the accuracy of speech recognition by leveraging a limited
amount of transcribed speech data and a more extensive set of unlabeled audio.
• Recommendation Systems: Improve the accuracy of personalized recommendations by
supplementing a sparse set of user-item interactions (labeled data) with a wealth of
unlabeled user behavior data.
• Healthcare and Medical Imaging: Enhance medical image analysis by utilizing a small set
of labeled medical images alongside a larger set of unlabeled images.
Reinforcement learning:
▪ Reinforcement learning involves training an agent to interact with an environment and learn from the
feedback it receives.
▪ The agent learns through a trial-and-error process by taking actions and receiving rewards or penalties based
on its performance.
▪ The goal is to maximize the cumulative reward over time, leading to the development of optimal strategies or
policies.

Fig 6: Reinforcement learning process [5]


Example of reinforcement learning:
• Let's consider a simple example of training an RL agent to play a game like chess.
1. Agent: The RL agent is a computer program that will learn to play chess.
2. Environment: The environment is the chessboard and the rules of chess.
3. State: The state represents the current arrangement of pieces on the chessboard.
4. Action: Actions are the legal moves that the agent can make in the current state (e.g., moving a pawn, capturing an
opponent's piece).
5. Reward: The reward could be +1 if the agent wins the game, -1 if the agent loses, and 0 for draws or
intermediate states. The rewards shape the agent's learning by providing feedback on the desirability of its
actions.
6. Policy: The policy in this case is the strategy the agent uses to decide which move to make based on the current state.
It could be a set of rules or a learned policy represented by a neural network.
7. Value Function: The value function estimates how favorable a particular state is for the agent. It helps the agent make
decisions that maximize its long-term rewards.
8. Q-Function: The Q-function estimates the value of taking a specific action in a specific state, while considering the
agent's future actions.
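The Q-function above can be learned with the tabular Q-learning update covered in Unit V. A chess state space is far too large for a table, so the hedged sketch below uses a toy five-state corridor instead; the environment and all parameter values are illustrative assumptions:

import numpy as np

# Minimal tabular Q-learning on a toy corridor: states 0..4, actions 0 = left,
# 1 = right; reaching state 4 ends the episode with reward +1.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount, exploration
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    for _ in range(100):                # cap the episode length
        if rng.random() < epsilon:      # explore
            a = int(rng.integers(n_actions))
        else:                           # exploit, breaking ties randomly
            a = int(rng.choice(np.flatnonzero(Q[s] == Q[s].max())))
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: Q(s,a) <- Q(s,a) + alpha*(r + gamma*max Q(s',.) - Q(s,a))
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == n_states - 1:
            break

print(Q)   # after training, "right" has the higher value in every state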
Example of reinforcement learning:
• For an easier explanation, let’s take the example of a dog.
• We can train our dog to perform certain actions, of course, it won’t be an easy task. You would order the dog
to do certain actions and for every proper execution, you would give a biscuit as a reward. The dog will
remember that if it does a certain action, it would get biscuits. This way it will follow the instructions properly
next time.

• We can take another example, in this case, a human child.


• Kids often make mistakes. Adults try to make sure they learn from them and do not repeat them. In this
case, we can apply the concept of feedback. If the parents are strict, they will scold the children for any
mistake. This is a negative type of feedback: the child will remember that if it performs a certain wrong action,
the parents will scold it.
Why use Reinforcement learning?

• It helps you to find which situation needs an action.

• Helps you to discover which action yields the highest reward over the longer period.

• Reinforcement learning also provides the learning agent with a reward function.

• It also allows it to figure out the best method for obtaining large rewards.
Advantages and disadvantages of reinforcement learning:
Advantages:
• It can solve higher-order, complex problems, and the solutions obtained can be very accurate.
• The model undergoes a rigorous training process that can take time; this helps to correct errors.
• Due to its learning ability, it can be used with neural networks; this is termed deep reinforcement learning.
• Since the model learns continually, a mistake made earlier is unlikely to be repeated in the future.
• Even when no training data is provided up front, the agent learns from the experience it gathers by interacting with
its environment.

Disadvantages:
• Reinforcement learning is a poor fit for simpler problems; the models are generally intended for complex ones.
• Reinforcement learning models require a large amount of data and interaction to produce accurate results.
• This consumes time and a lot of computational power.
• When building models for real-world applications, the maintenance cost is very high.
• Excessive training can lead to an explosion of the model's state space, which may exhaust memory while processing
the training data.
Transfer learning (Pre-trained models):
▪ Transfer learning involves leveraging knowledge or models learned from one task or domain to improve performance on
another related task or domain.
▪ In transfer learning, pretrained models are commonly used. These are models that have been trained on a large dataset for a
different task, such as image classification on ImageNet (1.2 million images across 1,000 categories).
▪ These pretrained models capture general features that can be valuable for a variety of tasks.
▪ The pre-trained models are fine-tuned or adapted to the new problem with a smaller amount of task-specific data.
▪ Pre-trained models are often used for image classification, object detection, natural language processing, and generative
modeling. By leveraging pre-trained models, we can reduce the need for extensive training on large datasets from scratch.

Fig 7: Transfer learning [6]


Transfer learning (Pre-trained models):
1. VGG (Visual Geometry Group): VGG16, VGG19

2. ResNet (Residual Network): ResNet50, ResNet101, ResNet152

3. Inception: InceptionV3, InceptionResNetV2

4. MobileNet: MobileNetV1, MobileNetV2, MobileNetV3

5. DenseNet (Densely Connected Convolutional Network): DenseNet121, DenseNet169, DenseNet201

6. EfficientNet: EfficientNetB0 to EfficientNetB7

7. GPT (Generative Pre-trained Transformer): GPT-2 to GPT-5

8. YOLO (You Only Look Once): YOLOv3 to YOLOv8

9. Xception
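
A brief sketch of how such a pretrained model is reused in practice, assuming TensorFlow/Keras, MobileNetV2 as the backbone, and a hypothetical five-class target task:

import tensorflow as tf
from tensorflow.keras import layers, models

# load MobileNetV2 pretrained on ImageNet, without its classification head
base = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                         include_top=False,
                                         weights="imagenet")
base.trainable = False  # freeze the pretrained feature extractor

# attach a small task-specific head (five classes here, purely hypothetical)
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # fine-tune on your data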
Advantages and disadvantages of transfer learning:
Advantages:
• Speed up the training process: By using a pre-trained model, the model can learn more quickly and effectively on the
second task, as it already has a good understanding of the features and patterns in the data.
• Better performance: Transfer learning can lead to better performance on the second task, as the model can leverage the
knowledge it has gained from the first task.
• Handling small datasets: When there is limited data available for the second task, transfer learning can help to prevent
overfitting, as the model will have already learned general features that are likely to be useful in the second task.

Disadvantages:
• Domain mismatch: The pre-trained model may not be well-suited to the second task if the two tasks are vastly different
or the data distribution between the two tasks is very different.
• Overfitting: Transfer learning can lead to overfitting if the model is fine-tuned too much on the second task, as it may
learn task-specific features that do not generalize well to new data.
• Complexity: The pre-trained model and the fine-tuning process can be computationally expensive and may require
specialized hardware.
Deep learning neural network:

Fig 8: Deep learning neural network [7]


Deep learning:
▪ Deep learning is a subfield of machine learning that focuses on training artificial neural networks with multiple layers
to learn and make predictions from complex data.
▪ There are several popular deep learning models, each with its own architecture and characteristics. Here's a brief
explanation of some of the commonly used deep learning models:
1. Feedforward Neural Networks (FNN): FNNs, also known as multilayer perceptrons (MLPs), are the simplest form of
deep learning models. They consist of an input layer, one or more hidden layers, and an output layer. Information flows
in one direction, from the input layer through the hidden layers to the output layer. FNNs are effective for tasks like
image classification and natural language processing.
2. Convolutional Neural Networks (CNN): CNNs are primarily used for analyzing visual data, such as images and
videos. They employ convolutional layers that perform localized feature extraction, capturing patterns and spatial
dependencies in the data. CNNs are especially effective for tasks like image recognition, object detection, and image
segmentation.
3. Recurrent Neural Networks (RNN): RNNs are designed to handle sequential data, where the current input depends
on previous inputs. They have loops within their architecture, allowing them to maintain memory of past information.
This memory enables RNNs to model sequences effectively, making them suitable for tasks like speech recognition,
language modeling, and machine translation.
Deep learning:

4. Long Short-Term Memory (LSTM): LSTMs are a specialized type of RNN that addresses the vanishing gradient
problem, which occurs when training deep neural networks. LSTMs have a more complex structure with memory cells,
input gates, forget gates, and output gates. They are widely used for tasks that involve longer-term dependencies, such as
speech recognition, text generation, and sentiment analysis.

5. Generative Adversarial Networks (GAN): GANs consist of two neural networks, a generator and a discriminator,
that compete against each other in a game-theoretic framework. The generator tries to produce realistic data samples,
while the discriminator aims to distinguish between real and generated samples. GANs are popular for generating realistic
images, video synthesis, and data augmentation.

6. Transformer: Transformers are an attention-based model architecture that has gained significant attention in natural
language processing (NLP). Unlike traditional sequential models like RNNs, Transformers can capture dependencies
between words in a sentence simultaneously, enabling parallel computation and improved performance. Transformers have
been used in tasks such as machine translation, question answering, and text summarization.
Deep learning layers:
1. Input Layer: This is the first layer of the neural network, which receives the raw input data. Its role is to pass this input
data to the next layer without modifying it. The number of neurons in this layer corresponds to the dimensionality of the
input data.

2. Dense (Fully Connected) Layer: In this layer, each neuron is connected to every neuron in the previous and next layers.
This dense connectivity allows the network to learn complex patterns in the data. Each connection has an associated
weight, which the network learns during training to make predictions.

3. Convolutional Layer: This layer applies a set of learnable filters (kernels) to the input data using the convolution
operation. Each filter captures different local patterns in the input, allowing the network to extract hierarchical features
such as edges, textures, and shapes from images or spatial data.

4. Pooling Layer: Pooling layers reduce the spatial dimensions of the input data by down-sampling. Common pooling
operations include max pooling and average pooling, which respectively take the maximum or average value from a set
of values within a small window. Pooling helps to decrease the computational complexity of the model and make it more
robust to variations in input data.
Deep learning layers:
5. Recurrent Layer: Recurrent layers are designed to process sequential data by maintaining an internal state (hidden
state) that is updated at each time step. The output of the layer at each time step depends not only on the current
input but also on the previous hidden state, allowing the network to capture temporal dependencies in the data.
6. Dropout Layer: Dropout layers are a regularization technique used during training to prevent overfitting. During
training, a fraction of randomly selected neurons in the layer are temporarily dropped out (set to zero) with a
certain probability. This forces the network to learn more robust features by preventing it from relying too much
on any individual neuron.
7. Batch Normalization Layer: This layer normalizes the activations of the previous layer across the mini-batch of
data. Normalization helps to stabilize and accelerate the training process by reducing internal covariate shift and
allowing higher learning rates.
8. Activation Layer: Activation layers apply non-linear transformations to the output of the previous layer. Common
activation functions include ReLU (Rectified Linear Unit), sigmoid, and tanh. These non-linearities introduce
flexibility into the model, enabling it to learn complex mappings between inputs and outputs.
9. Output Layer: The output layer produces the final predictions or outputs of the network. The number of neurons
in this layer depends on the desired output dimensionality for the task at hand. The activation function used in this
layer depends on the nature of the task (e.g., softmax for multi-class classification, linear for regression).
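To show how these layers fit together, here is a hedged sketch of a small Keras network stacking several of the layer types above (input, convolutional, pooling, batch normalization, dense, dropout, and output); the 28 x 28 input shape and 10 classes are illustrative assumptions:

import tensorflow as tf
from tensorflow.keras import layers, models

# a small CNN combining several of the layer types described above
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),               # input layer: 28 x 28 grayscale images
    layers.Conv2D(32, (3, 3), activation="relu"),  # convolutional layer
    layers.MaxPooling2D((2, 2)),                   # pooling layer
    layers.BatchNormalization(),                   # batch normalization layer
    layers.Flatten(),
    layers.Dense(64, activation="relu"),           # dense (fully connected) layer
    layers.Dropout(0.5),                           # dropout layer (regularization)
    layers.Dense(10, activation="softmax"),        # output layer for 10 classes
])
model.summary()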
Well-posed learning problems:
A well-defined problem includes not only a clear problem statement but also well-defined evaluation
criteria. This means that the problem statement should precisely outline what needs to be achieved or
predicted, and the evaluation criteria should provide a measurable way to assess the performance of the
solution.

For example, in a classification problem, the problem statement might specify that the task is to classify
emails as either spam or not spam. The evaluation criteria could be the accuracy of the classifier, measured
by the proportion of correctly classified emails in a test dataset.

In a regression problem, the problem statement might involve predicting house prices based on various
features. The evaluation criteria could be the mean squared error (MSE) between the predicted prices and
the actual prices in a test dataset.
Well-posed learning problems:
▪ A computer program is said to learn from experience E with respect to some task T and some performance
measure P, if its performance on T, as measured by P, improves with experience E.
▪ A problem can be classed as a well-posed learning problem if it has three traits –
I. Task
II. Performance Measure
III. Experience
▪ Some examples that effectively illustrate well-posed learning problems are –
1. To better filter emails as spam or not
Task – Classifying emails as spam or not
Performance Measure – The fraction of emails accurately classified as spam or not spam
Experience – Observing you label emails as spam or not spam
2. A checkers learning problem
Task – Playing the game of checkers
Performance Measure – Percent of games won against opponents
Experience – Playing practice games against itself
Well-posed learning problems:
3. Handwriting Recognition Problem
Task – Recognizing handwritten words within images
Performance Measure – Percent of words accurately classified
Experience – A database of handwritten words with given classifications
4. Fruit Prediction Problem
Task – Recognizing different types of fruits
Performance Measure – The proportion of fruit varieties correctly predicted
Experience – Training the machine with a large dataset of fruit images
5. Face Recognition Problem
Task – Recognizing different types of faces
Performance Measure – The proportion of faces correctly predicted
Experience – Training the machine with a large dataset of different face images
6. Automatic Translation of Documents
Task – Translating a document from one language into another
Performance Measure – How accurately and efficiently one language is converted into the other
Experience – Training the machine with a large dataset of documents in different languages
Designing a learning system:
▪ Designing a learning system involves several steps which are discussed below.
1. Problem Definition and Understanding:
1. Clearly define the problem you intend to solve. Understand the context, goals, and objectives of the problem.
2. Determine whether the problem requires supervised learning, unsupervised learning, reinforcement learning, or a
combination of these approaches.
3. Identify the input data (features) and the desired output (target) for the learning system.
2. Data Collection and Preprocessing:
1. Gather relevant data for training, validation, and testing. Data can come from various sources, such as databases,
APIs, sensors, or surveys.
2. Clean the data by handling missing values, outliers, and noisy data.
3. Preprocess the data by transforming and scaling features. This might involve techniques like normalization,
feature extraction, and dimensionality reduction.
3. Feature Engineering:
1. Select or engineer appropriate features that will be used as input for the learning algorithm.
2. Create new features that capture relevant patterns and information from the data.
3. Ensure that the features are meaningful, relevant, and contribute to the learning process.
Designing a learning system:
4.Model Selection:
1. Choose a suitable learning algorithm or model architecture based on the problem type (e.g.,
classification, regression, clustering) and the characteristics of the data.
2. Consider factors such as model complexity and computational requirements.
5.Model Training:
1. Split the dataset into training, testing and validation sets.
2. Use the training data to train the selected model. During this process, the model learns the relationships
between the input features and the target output.
3. Tune hyperparameters using the validation set to optimize the model's performance.
6.Model Evaluation:
1. Evaluate the trained model's performance using the testing set, which the model has not seen during
training.
2. Use appropriate evaluation metrics based on the problem type. For example, accuracy, precision, recall,
F1-score, mean squared error, etc.
3. Analyze the results to understand how well the model is performing and whether it meets the desired
criteria.
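A hedged sketch of steps 4 to 6 (model selection, training with hyperparameter tuning, and evaluation on held-out data) using scikit-learn; the dataset and the SVC parameter grid are illustrative assumptions:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# hold out a test set; validation is handled by cross-validation below
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

# model selection and training: tune hyperparameters with 5-fold CV
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

# model evaluation on data never seen during training or tuning
print("Best hyperparameters:", search.best_params_)
print("Test accuracy:", search.score(X_test, y_test))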
Designing a learning system:
7. Model Optimization:
1. If the model's performance is not satisfactory, consider adjusting hyperparameters, experimenting with
different algorithms, or collecting more relevant data.
2. Address issues like overfitting (model performs well on training data but poorly on new data) or
underfitting (model is too simple to capture underlying patterns).

8. Deployment and Integration:


1. Once satisfied with the model's performance, deploy it into the intended environment. This could be a
web application, mobile app, embedded system, etc.
2. Ensure that the deployment environment is compatible with the model's requirements in terms of
computing resources and data input format.

9. Monitoring and Maintenance:


1. Continuously monitor the performance of the deployed model in real-world scenarios.
2. Update the model as new data becomes available or as the problem requirements change.
3. Address issues that arise due to changes in the data distribution or environment.
Designing a learning system:
11. Ethical and Legal Considerations:
1. Consider ethical implications related to data privacy, bias, fairness, and transparency.
2. Ensure compliance with relevant regulations and laws, especially when dealing with sensitive data or
critical applications.
12. Documentation:
1. Document the entire process, including problem formulation, data sources, preprocessing steps, model
selection, training, evaluation, and deployment.
2. Document the decisions made at each step, along with the rationale behind them. This documentation
aids in reproducibility and future reference.
13. Feedback Loop and Iteration:
1. Gather feedback from users, stakeholders, and the performance of the deployed system.
2. Iterate and improve the learning system based on the feedback received, changing requirements, and
emerging technologies.
Bias and variance in machine learning:
▪ A machine learning model analyses the data, finds patterns in it, and makes predictions.
▪ However, if the machine learning model is not accurate, it makes prediction errors, and these prediction errors are
usually known as bias and variance.
▪ In machine learning, these errors will always be present, as there is always a slight difference between the model's
predictions and the actual values.
▪ The main aim of ML/data science analysts is to reduce these errors (bias and variance) in order to develop a robust
model.
Bias:
▪ While making predictions, a difference occurs between prediction values made by the model and actual
values/expected values, and this difference is known as bias errors or errors due to bias.
▪ It can be defined as an inability of machine learning algorithms such as Linear Regression to capture the true
relationship between the data points.
Low Bias: A low bias model will make fewer assumptions about the form of the target function.
High Bias: A model with a high bias makes more assumptions, and the model becomes unable to capture
the important features of our dataset. A high bias model also cannot perform well on new data.
Bias in machine learning:

Fig : Biasing [8]


▪ The simpler the algorithm, the higher the bias.
▪ Linear algorithms tend to have high bias (Linear Regression, Linear Discriminant Analysis and Logistic Regression).
▪ Non-linear models tend to have low bias (Decision Trees, k-Nearest Neighbours and Support Vector Machines).
▪ High bias mainly occurs due to an overly simple model.
Variance error:
▪ Variance specifies how much the prediction would vary if a different training dataset were used.
▪ In simple words, variance tells how much a random variable differs from its expected value.
▪ Variance errors are either of low variance or high variance.
➢ Low variance means there is a small variation in the prediction of the target function with changes in the training data
set.
➢ High variance shows a large variation in the prediction of the target function with changes in the training dataset.
▪ A model with high variance learns the training dataset very well and performs well on it, but does not generalize well
to unseen data.
▪ Since, with high variance, the model learns too much from the dataset, this leads to overfitting. A model
with high variance has the following problems:
➢ A high variance model leads to overfitting.
➢ Increased model complexity.

▪ Low variance machine learning models: linear regression, logistic regression, and linear discriminant analysis.
▪ High variance machine learning models: decision tree, support vector machine, and k-nearest neighbours.
Different Combinations of Bias-Variance:

There are four possible combinations of bias and variance, which are represented by the diagram below.

Fig: Bias variance combination [9]


Different Combinations of Bias-Variance:
There can be four combinations between bias and variance.

▪ High Bias, Low Variance: A model with high bias and low variance is said to be underfitting.

▪ High Variance, Low Bias: A model with high variance and low bias is said to be overfitting.

▪ High-Bias, High-Variance: A model has both high bias and high variance, which means that the model is not able to
capture the underlying patterns in the data (high bias) and is also too sensitive to changes in the training data (high
variance). As a result, the model will produce inconsistent and inaccurate predictions on average.

▪ Low Bias, Low Variance: A model that has low bias and low variance is able to capture the
underlying patterns in the data (low bias) and is not too sensitive to changes in the training data (low variance). This
is the ideal scenario for a machine learning model, as it generalizes well to new, unseen data and produces
consistent and accurate predictions. In practice, however, this ideal is rarely fully achievable.
Bias-Variance Trade-Off:
▪ While building the machine learning model, it is really important to take care of bias and variance in order to avoid
overfitting and underfitting in the model.
▪ If the model is very simple with fewer parameters, it may have low variance and high bias.
▪ Whereas, if the model has a large number of parameters, it will have high variance and low bias.
▪ So, it is required to make a balance between bias and variance errors, and this balance between the bias error and
variance error is known as the Bias-Variance trade-off.

Fig: Bias variance trade-off [8]


Ways to reduce high bias in Machine Learning:

▪ Use a more complex model: One of the main reasons for high bias is an overly simplified model that cannot
capture the complexity of the data. In such cases, we can make our model more complex by increasing the number of
hidden layers in a deep neural network, or we can use a more complex model such as polynomial regression
for non-linear datasets, a CNN for image processing, or an RNN for sequence learning.

▪ Increase the number of features: Adding more features to the training data increases the capacity of the
model and improves its ability to capture the underlying patterns in the data.

▪ Reduce regularization of the model: Regularization techniques such as L1 or L2 regularization help to prevent
overfitting and improve the generalization ability of the model. If the model has high bias, reducing the strength of
regularization, or removing it altogether, can help to improve its performance.

▪ Increase the size of the training data: Increasing the size of the training data can help to reduce bias by providing
the model with more examples to learn from the dataset.
Ways to reduce variance in machine learning:

• Cross-validation: By splitting the data into training and testing sets multiple times, cross-validation can help identify
if a model is overfitting or underfitting and can be used to tune hyperparameters to reduce variance.

• Feature selection: Choosing only the relevant features decreases the model's complexity and can reduce the
variance error.

• Regularization: We can use L1 or L2 regularization to reduce variance in machine learning models.

• Ensemble methods: Ensemble methods combine multiple models to improve generalization performance. Bagging, boosting, and
stacking are common ensemble methods that can help reduce variance.

• Simplifying the model: Reducing the complexity of the model, such as decreasing the number of parameters or
layers in a neural network, can also help reduce variance and improve generalization performance.

• Early stopping: Early stopping is a technique used to prevent overfitting by stopping the training of the deep
learning model when the performance on the validation set stops improving.
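
One way to see the bias-variance trade-off, and the role of cross-validation in detecting it, is to compare polynomial models of different degrees on the same noisy data. In this hedged sketch (all data and degrees are illustrative), a low degree typically shows high bias and a very high degree typically shows high variance:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# noisy nonlinear data: y = sin(x) + noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)

# degree 1 tends to underfit (high bias); degree 15 tends to overfit
# (high variance); a moderate degree usually balances the two
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    mse = -cross_val_score(model, X, y, cv=5,
                           scoring="neg_mean_squared_error").mean()
    print(f"degree {degree:2d}: cross-validated MSE = {mse:.3f}")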
Overfitting and underfitting:
▪ Overfitting occurs when a model learns the training data too well, capturing not only the underlying patterns but also
the noise and randomness present in the data.
▪ As a result, an overfit model will perform exceptionally well on the training data but will struggle to generalize to
new, unseen data.
▪ This phenomenon can be thought of as the model "memorizing" the training data rather than learning the true
underlying relationships.

Fig: Bias variance trade-off [10]


Characteristics of overfitting:
▪ Low training error: The model's performance on the training data is excellent.
▪ High validation/testing error: The model's performance on new data is poor.
▪ Model captures noise: The model might fit to outlier points, errors, or random fluctuations in the training data.
▪ Complex models: Overfitting is more likely to occur when using complex models with a large number of parameters.
▪ Poor generalization: The model struggles to make accurate predictions on new, unseen data.
▪ Causes of Overfitting:
▪ Too much complexity: Models with too many parameters can easily fit the noise in the data.
▪ Insufficient data: If the dataset is small, the model may overfit to the limited information available.
▪ Overtraining: Training for too many epochs or iterations can lead to overfitting.
▪ Incorrect features: Including irrelevant or redundant features can contribute to overfitting.
Underfitting:
▪ Underfitting, on the other hand, occurs when a model is too simplistic to capture the underlying patterns in the data.
▪ In this case, the model fails to learn even the training data well, resulting in poor performance on both the training
data and new data.

Fig: Underfitting
Characteristics of underfitting :
▪ High training error: The model's performance on the training data is not good.
▪ High validation/testing error: The model's performance on new data is also poor.
▪ Model is too simple: The model lacks the capacity to capture the underlying relationships in the data.
▪ Oversimplified features: The model might not be able to understand the complexities of the data.

• Causes of Underfitting:
• Simplistic model: Using a model with too few parameters or overly simplified structure.
• Insufficient training: Not training the model for enough epochs or iterations.
• Insufficient features: Lack of relevant features or using overly generalized features.
Vapnik-chervonenkis (VC) dimension:
▪ Vapnik-Chervonenkis (VC) dimension measures the capacity of a hypothesis space (set of possible functions or
classifiers) to shatter a given set of points.
1. Shattering: The concept of shattering refers to whether a hypothesis space can classify (or label) a given set of points
in all possible ways. In other words, if a hypothesis space can assign arbitrary labels to a set of points, it is said to
shatter those points.
2. VC Dimension: The VC dimension of a hypothesis space is the maximum number of points (in some arrangement) that
can be shattered by the space. In other words, it is the largest dataset size for which there exists an arrangement of
points such that the hypothesis space can represent every possible labeling of them.

Fig: Shattering with 3 points [10]
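
As the figure suggests, linear classifiers in the plane can shatter three points in general position but not four. The general statement for halfspaces in d dimensions (a standard result, stated here for reference):

\[
\mathrm{VCdim}\left(\{\, x \mapsto \operatorname{sign}(w^{\top}x + b) : w \in \mathbb{R}^{d},\ b \in \mathbb{R} \,\}\right) = d + 1
\]

So for d = 2 the VC dimension is 3, matching the figure.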


Sensitivity Analysis:
Sensitivity analysis in machine learning involves assessing how changes in input variables affect the output of a
model. It helps identify which inputs have the most significant impact on the model's predictions and can be useful for
feature selection, understanding model behavior, and assessing robustness.

Fig: Correlation matrix for the five attributes of interest in the diamonds dataset
Sensitivity Analysis:

Fig: Pie chart showing the average contribution of each diamond attribute to the diamond price
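
One common way to carry out such an analysis is permutation importance: shuffle one input feature at a time and measure how much the model's score drops. A hedged sketch with scikit-learn, using the bundled diabetes dataset as a stand-in (the diamonds data shown in the figures is not shipped with scikit-learn):

from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# load a regression dataset and fit a model
X, y = load_diabetes(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# permutation importance: shuffle one input at a time and measure how much
# the model's score drops; a larger drop means the output is more sensitive
# to that input
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for name, score in sorted(zip(X.columns, result.importances_mean),
                          key=lambda p: -p[1]):
    print(f"{name}: {score:.4f}")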
Concept learning:
▪ In Machine Learning, concept learning can be termed as “a problem of searching through a predefined space of potential
hypothesis for the hypothesis that best fits the training examples” – Tom Mitchell.

Fig: Concept learning [11]


▪ Let’s try to understand concept learning with a real-life example. Most of human learning is based on past instances or
experiences.
▪ For example, we are able to identify any type of vehicle based on a certain set of features like make, model, etc., that are
defined over a large set of features.
▪ These special features differentiate the set of cars, trucks, etc from the larger set of vehicles. These features that define the
set of cars, trucks, etc are known as concepts.
▪ Similar to this, machines can also learn from concepts to identify whether an object belongs to a specific category or not.
Concept learning:
▪ Concept learning is the process of learning to recognize and categorize objects or situations based on their attributes
and relations.
▪ For example, a concept learning system might learn to identify different types of animals based on their shape, size,
color, and behavior.
▪ Concept learning can be seen as a form of inductive learning, where the system infers general rules or principles from
specific observations or examples.
▪ Concept learning can also be divided into two types: supervised and unsupervised.
▪ Supervised concept learning involves learning from labeled examples, where the system knows the correct category
or outcome for each input.
▪ Unsupervised concept learning involves learning from unlabeled examples, where the system has to discover the
underlying structure or similarity among the inputs.
What is find-S algorithm in machine learning?
▪ In order to understand Find-S algorithm, you need to have a basic idea of the following concepts as well:
▪ Concept Learning: discussed already.
▪ General Hypothesis: Hypothesis, in general, is an explanation for something. The general hypothesis basically states
the general relationship between the major variables.
For example, a general hypothesis for ordering food would be I want a burger.
G = {‘?’, ‘?’, ‘?’, …, ‘?’}
▪ Specific Hypothesis: The specific hypothesis fills in all the important details about the variables given in the general
hypothesis.
▪ The more specific details into the example given above would be I want a cheeseburger with a chicken pepperoni
filling with a lot of lettuce.
S = {‘Φ’, ‘Φ’, ‘Φ’, …, ‘Φ’}
• Representations:
The most specific hypothesis is represented using ϕ.
The most general hypothesis is represented using ?.
What is find-S algorithm in machine learning?
▪ The Find-S algorithm finds the most specific hypothesis that fits all the positive examples.
▪ Note that the algorithm considers only the positive training examples.
▪ The find-S algorithm starts with the most specific hypothesis and generalizes this hypothesis each time it fails to
classify an observed positive training data.
▪ Hence, the Find-S algorithm moves from the most specific hypothesis to the most general hypothesis.

▪ Find-S algorithm, is a machine learning algorithm that seeks to find a maximally specific hypothesis based on labeled
training data. It starts with the most specific hypothesis and generalizes it by incorporating positive examples. It
ignores negative examples during the learning process.

▪ The algorithm's objective is to discover a hypothesis that accurately represents the target concept by progressively
expanding the hypothesis space until it covers all positive instances.
Find-S algorithm follows the steps written below:
▪ Start with the most specific hypothesis i.e. h = {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}
▪ Take the next example and if it is negative, then no changes occur to the hypothesis.
▪ If the example is positive and we find that our initial hypothesis is too specific then we update our current hypothesis
to a general condition.
▪ Keep repeating the above steps until all the training examples have been processed.
▪ After all the training examples have been processed, we have the final hypothesis, which can be used to
classify the new examples.

Fig: Find S algorithm [12]


Example: Find-S algorithm
▪ Consider the following data set having the data about which particular seeds are poisonous.
Example: Find-S algorithm
▪ First, we consider the hypothesis to be a more specific hypothesis. Hence, our hypothesis would be :
h = {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}

▪ Consider example 1 :
The data in example 1 is { GREEN, HARD, NO, WRINKLED }. We see that our initial hypothesis is more specific and
we have to generalize it for this example. Hence, the hypothesis becomes :
h = { GREEN, HARD, NO, WRINKLED }

• Consider example 2 :
Here we see that this example has a negative outcome. Hence we neglect this example and our hypothesis remains the
same.
h = { GREEN, HARD, NO, WRINKLED }

• Consider example 3 :
Here we see that this example has a negative outcome. Hence we neglect this example and our hypothesis remains the
same.
h = { GREEN, HARD, NO, WRINKLED }
Example: Find-S algorithm
• Consider example 4 :
The data present in example 4 is { ORANGE, HARD, NO, WRINKLED }. We compare every single attribute with
the initial data and if any mismatch is found we replace that particular attribute with a general case ( ” ? ” ). After
doing the process the hypothesis becomes :
h = { ?, HARD, NO, WRINKLED }
• Consider example 5:
The data in example 5 is { GREEN, SOFT, YES, SMOOTH }, a positive example. Again we compare each attribute with the current hypothesis and replace every mismatch with "?". The hypothesis becomes:
h = { ?, ?, ?, ? }
• Since all the attributes in our hypothesis are now the general condition, examples 6 and 7 (both positive) leave the hypothesis unchanged:
h = { ?, ?, ?, ? }
• Hence, for the given data the final hypothesis would be :
Final Hypothesis: h = { ?, ?, ?, ? }
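• Note that the all-'?' hypothesis classifies every instance as positive, so for this data set Find-S has generalized away all discriminating information: a single conjunction of attribute values cannot separate the positive from the negative seeds.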
Example 2: Find S algorithm
• Looking at the data set, we have six attributes and a final attribute that labels each example as positive or negative. Here 'Yes' marks a positive example, meaning the person goes for a walk.

Time      Weather  Temperature  Company  Humidity  Wind    Goes
Morning   Sunny    Warm         Yes      Mild      Strong  Yes
Evening   Rainy    Cold         No       Mild      Normal  No
Morning   Sunny    Moderate     Yes      Normal    Normal  Yes
Evening   Sunny    Cold         Yes      High      Strong  Yes
Taking the first positive example as the initial, most specific hypothesis:

h0 = {'Morning', 'Sunny', 'Warm', 'Yes', 'Mild', 'Strong'}

We now consider each remaining example one by one, but only the positive examples, replacing every attribute value that differs from the current hypothesis with '?':
h1 = {'Morning', 'Sunny', '?', 'Yes', '?', '?'} (after the third example)
h2 = {'?', 'Sunny', '?', 'Yes', '?', '?'} (after the fourth example)
h2 is the resultant (final) hypothesis.
Python code for find S algorithm:
import pandas as pd
import numpy as np

# read the training data from the CSV file
data = pd.read_csv("data.csv")
print(data, "\n")

# array of all attribute columns (every column except the last)
d = np.array(data)[:, :-1]
print("\nThe attributes are: ", d)

# segregating the target column that holds the positive/negative labels
target = np.array(data)[:, -1]
print("\nThe target is: ", target)

# training function implementing the Find-S algorithm
def train(c, t):
    # initialize the specific hypothesis from the first positive example
    for i, val in enumerate(t):
        if val == "Yes":
            specific_hypothesis = c[i].copy()
            break
    # generalize attribute-by-attribute on every positive example
    for i, val in enumerate(c):
        if t[i] == "Yes":
            for x in range(len(specific_hypothesis)):
                if val[x] != specific_hypothesis[x]:
                    specific_hypothesis[x] = '?'
    return specific_hypothesis

# obtaining the final hypothesis
print("\nThe final hypothesis is:", train(d, target))
Find S algorithm: Output
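The output screenshots from the source slides are not reproduced here. As a sketch (assuming data.csv contains the 'goes for a walk' table from Example 2 above), the script prints the attribute array, the target column, and finally:

The final hypothesis is: ['?' 'Sunny' '?' 'Yes' '?' '?']

which matches the hand-derived hypothesis h2 above.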
Candidate Elimination Algorithm:
• Given a hypothesis space H and a collection E of instances, the candidate elimination procedure develops the
version space progressively.
• The examples are added one by one; each example possibly shrinks the version space by removing the
hypotheses that are inconsistent with the example.
• The candidate elimination algorithm does this by updating the general and specific boundary for each new
example.
• It considers both positive and negative examples.
• It relies on the concept of version space.
• For a positive example, the specific boundary is generalized, moving it toward the general boundary.
• For a negative example, the general boundary is specialized, moving it toward the specific boundary.
• At the end of the algorithm, we get both the specific and the general boundary as our final solution.
Terms Used:
• Concept learning: the learning task of inferring the target concept from training data.
• General Hypothesis: places no constraint on any attribute.
G = {'?', '?', '?', '?', …}: one '?' per attribute
• Specific Hypothesis: constrains every attribute to a specific value.
S = {'φ', 'φ', 'φ', …}: the number of φ entries depends on the number of attributes.
• Version Space: the region between the general and specific boundaries. It is not just one
hypothesis but the set of all possible hypotheses consistent with the training data set.
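• Formally, the version space with respect to hypothesis space H and training data D is the set of all hypotheses consistent with every training example: VS(H, D) = { h ∈ H : h(x) = c(x) for each example ⟨x, c(x)⟩ in D }.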
Algorithm:

Fig: Candidate elimination method [13]


Candidate Elimination Algorithm:
• 1. Initialize both specific and general hypotheses.
S = < 'ϕ', 'ϕ', 'ϕ', ….., 'ϕ' >
G = < '?', '?', '?', ….., '?' >
depending on the number of attributes.
• 2. Take the next example; if it is positive, generalize the specific hypothesis S just enough to cover it, and remove from G any hypothesis inconsistent with it.
• 3. If the example is negative, specialize the general hypothesis G just enough to exclude it, leaving S unchanged.
Table: dataset for candidate elimination method [13]
Candidate elimination algorithm:
• Initially : G = [[?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?]]
S = [Null, Null, Null, Null, Null, Null]
• For instance 1: <'sunny','warm','normal','strong','warm','same'> and positive output.
G1 = G
S1 = ['sunny','warm','normal','strong','warm','same']
• For instance 2: <'sunny','warm','high','strong','warm','same'> and positive output.
G2 = G
S2 = ['sunny','warm',?,'strong','warm','same']
▪ For instance 3: <'rainy','cold','high','strong','warm','change'> and negative output.
G3 = [['sunny', ?, ?, ?, ?, ?], [?, 'warm', ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, 'same']]
S3 = S2
• For instance 4: <'sunny','warm','high','strong','cool','change'> and positive output.
G4 = [['sunny', ?, ?, ?, ?, ?], [?, 'warm', ?, ?, ?, ?]] (the hypothesis [?, ?, ?, ?, ?, 'same'] is inconsistent with this positive example and is removed)
S4 = ['sunny','warm',?,'strong', ?, ?]
Finally, discarding the remaining all-'?' placeholder rows of G4, the algorithm produces the output below.
Candidate elimination algorithm: Output
G = [['sunny', ?, ?, ?, ?, ?], [?, 'warm', ?, ?, ?, ?]]
S = ['sunny','warm',?,'strong', ?, ?]
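To make the boundary updates concrete, here is a minimal Python sketch of this simplified candidate elimination procedure. It is our own illustration (not code from the cited sources) and assumes, as the trace above does, that G is kept as one candidate specialization per attribute; run on the EnjoySport data it reproduces the G and S shown above.

import numpy as np

# simplified candidate elimination: S is a single conjunction,
# G holds one candidate specialization per attribute
def candidate_elimination(examples, targets):
    n = examples.shape[1]
    # seed S with the first positive example; start G fully general
    specific = examples[targets == "Yes"][0].copy()
    general = [["?"] * n for _ in range(n)]
    for row, label in zip(examples, targets):
        if label == "Yes":
            for i in range(n):
                if row[i] != specific[i]:
                    specific[i] = "?"      # generalize S to cover the example
                    general[i][i] = "?"    # drop the G member now inconsistent with S
        else:
            for i in range(n):
                if row[i] != specific[i]:
                    general[i][i] = specific[i]  # specialize G to exclude the example
    # keep only the G members that were actually specialized
    general = [g for g in general if g != ["?"] * n]
    return specific, general

data = np.array([
    ["sunny", "warm", "normal", "strong", "warm", "same",   "Yes"],
    ["sunny", "warm", "high",   "strong", "warm", "same",   "Yes"],
    ["rainy", "cold", "high",   "strong", "warm", "change", "No"],
    ["sunny", "warm", "high",   "strong", "cool", "change", "Yes"],
])
S, G = candidate_elimination(data[:, :-1], data[:, -1])
print("S =", S)  # ['sunny' 'warm' '?' 'strong' '?' '?']
print("G =", G)  # [['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]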
Training Example:

Sky    Temp  Humidity  Wind    Water  Forecast  Sport
Sunny  Warm  Normal    Strong  Warm   Same      Yes
Sunny  Warm  High      Strong  Warm   Same      Yes
Rainy  Cold  High      Strong  Warm   Change    No
Sunny  Warm  High      Strong  Cool   Change    Yes

Step 1:
Initialize G and S as the most general and the most specific hypotheses, respectively.
G ={'?', '?','?','?', '?','?'}

S = {'φ','φ','φ','φ','φ','φ'}
Step 2:
For each positive example, make the specific hypothesis more general.
S = {'φ','φ','φ','φ','φ','φ'}
Take the first positive instance as the initial specific hypothesis:

h = {'Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'}

The general hypothesis remains the same: G = {'?', '?','?','?', '?','?'}
Training Example:

• Step 3:
Compare each attribute of the next positive instance with the current hypothesis:
if (attribute value == hypothesis value) do nothing;
else
replace the hypothesis value with the more general constraint '?'.
Since instance 2 is also positive, we compare it with the hypothesis. Only the Humidity attribute differs, so we generalize that attribute.
• S = {'sunny', 'warm','?', 'Strong', 'warm', 'same'}

The general hypothesis remains the same: G = {'?', '?','?','?', '?','?'}

Step 4:
Instance 3 is negative, so for each negative example we make the general hypothesis more specific.
We do this by comparing each attribute of the negative instance with the corresponding attribute of the positive instance; wherever they differ, we create a dedicated hypothesis constraining that attribute to the positive value.
G = {<'sunny', '?','?','?', '?','?'> , <'?', 'warm','?','?', '?','?'> , <'?', '?','Normal','?', '?','?'> , <'?', '?','?','?', '?','same'>}

The specific hypothesis remains the same: S = {'sunny', 'warm','?', 'Strong', 'warm', 'same'}


Training Example:
• Step 5:
• Instance 4 is positive, so repeat step 3: S = {'sunny', 'warm','?', 'Strong', '?', '?'}
• Discard any member of the general hypothesis set that contradicts this specific hypothesis; here the Humidity and Forecast members <'?', '?','Normal','?', '?','?'> and <'?', '?','?','?', '?','same'> are discarded, leaving G = {<'sunny', '?','?','?', '?','?'> , <'?', 'warm','?','?', '?','?'>}.
• The maximally specific and general hypotheses are therefore:
• S = {'sunny', 'warm','?', 'Strong', '?', '?'}
• G = {<'sunny', '?','?','?', '?','?'> , <'?', 'warm','?','?', '?','?'>}

Fig: Version space [14]


Solved Numerical Example – 2 (Candidate Elimination Algorithm)

Example  Size   Color  Shape     Class/Label
1        Big    Red    Circle    No
2        Small  Red    Triangle  No
3        Small  Red    Circle    Yes
4        Big    Blue   Circle    No
5        Small  Blue   Circle    Yes

• Solution:
• S0: (0, 0, 0) Most Specific Boundary
• G0: (?, ?, ?) Most Generic Boundary
• The first example is negative. The hypothesis at the specific boundary is consistent with it, so we retain it; the hypothesis at the generic boundary is inconsistent, so we write all minimal specializations, replacing one "?" at a time with a value the negative example does not have.
• S1: (0, 0, 0)
• G1: (Small, ?, ?), (?, Blue, ?), (?, ?, Triangle)
Continue:
The second example is negative. The hypothesis at the specific boundary is consistent with it, so we retain it; the inconsistent hypotheses at the generic boundary are minimally specialized, replacing one "?" at a time.
S2: (0, 0, 0)
G2: (Small, Blue, ?), (Small, ?, Circle), (?, Blue, ?), (Big, ?, Triangle), (?, Blue, Triangle)

The third example is positive. The hypothesis at the specific boundary is inconsistent with it, so we generalize the specific boundary; the consistent hypotheses at the generic boundary are retained and the inconsistent ones are removed.
S3: (Small, Red, Circle)
G3: (Small, ?, Circle)

The fourth example is negative. The specific boundary is consistent with it and is retained; the generic boundary already excludes this example, so it is unchanged.
S4: (Small, Red, Circle)
G4: (Small, ?, Circle)
Continue:
The fifth example is positive. The specific boundary is inconsistent with it, so we generalize the specific boundary; the consistent hypothesis at the generic boundary is retained.
S5: (Small, ?, Circle)
G5: (Small, ?, Circle)
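Since S5 and G5 have converged to the same single hypothesis, the version space now contains exactly one hypothesis: the target concept is (Small, ?, Circle), i.e. small circles.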
Advantages and disadvantages:
Advantages of CEA over Find-S:
1. Improved accuracy: CEA considers both positive and negative examples to generate the hypothesis, which can result in higher accuracy when dealing with noisy or incomplete data.
2. Flexibility: CEA can handle more complex classification tasks, such as those with multiple classes or non-linear decision boundaries.
3. More efficient: CEA reduces the number of hypotheses by generating a set of general hypotheses and then eliminating them one by one. This can result in faster processing and improved efficiency.
4. Better handling of continuous attributes: CEA can handle continuous attributes by creating boundaries for each attribute, which makes it more suitable for a wider range of datasets.
Disadvantages of CEA in comparison with Find-S:
1. More complex: CEA is a more complex algorithm than Find-S, which may make it more difficult for beginners or those without a strong background in machine learning to use and understand.
2. Higher memory requirements: CEA requires more memory to store the set of hypotheses and boundaries, which may make it less suitable for memory-constrained environments.
3. Slower processing for large datasets: CEA may become slower for larger datasets due to the increased number of hypotheses generated.
4. Higher potential for overfitting: The increased complexity of CEA may make it more prone to overfitting on the training data, especially if the dataset is small or has a high degree of noise.
Issues in machine learning:
1. Bias and Fairness:
Issue: Bias in training data can lead to discriminatory or unfair predictions, disproportionately affecting certain groups.
Example: A hiring model trained on historical data might unfairly favor male candidates if the past hiring decisions
were biased towards male applicants.
2. Data Quality and Quantity:
Issue: Inaccurate or insufficient data can lead to poor model performance.
Example: A weather forecasting model trained on incomplete or incorrect weather data might struggle to accurately
predict future weather patterns.
3. Overfitting and Underfitting:
Issue: Overfitting occurs when a model captures noise in the training data and doesn't generalize well to new data,
while underfitting is when the model is too simple to capture underlying patterns.
Example: An overfitted spam email classifier memorizes specific words in the training data, leading to poor
performance on new emails.
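To make the contrast concrete, the short sketch below (our illustration, assuming scikit-learn and NumPy are available; it is not part of the original slides) fits polynomials of increasing degree to noisy samples of a sine curve and compares training and test error:

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# noisy samples of a sine curve: the "true" pattern plus noise
rng = np.random.RandomState(0)
X = rng.uniform(0, 1, 40)[:, None]
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 40)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):  # too simple, reasonable, too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
# Degree 1 typically underfits (high error on both sets); degree 15 typically
# overfits (near-zero training error but a much larger test error).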
Issues in machine learning with example:
4. Interpretable Models:
Issue: Complex models like deep neural networks can be difficult to interpret, making it challenging to understand why
a certain prediction was made.
Example: A medical diagnosis model based on a neural network might accurately diagnose patients, but doctors may
struggle to explain the reasoning behind the predictions.
5. Feature Engineering:
Issue: Selecting relevant features and engineering them properly is crucial for model performance.
Example: Building a sentiment analysis model for movie reviews requires identifying and representing important
features such as sentiment-bearing words.
6. Computational Resources:
Issue: Training large and complex models can be computationally expensive and require powerful hardware.
Example: Training a deep learning model for image recognition might require specialized GPUs to process the vast
amount of data efficiently.
7. Scalability:
Issue: Adapting machine learning solutions to handle large datasets or real-time applications can be challenging.
Example: An e-commerce recommendation system needs to quickly process user interactions and adjust
recommendations in real-time as more users interact with the platform.
Issues in machine learning with example:
8. Ethical Considerations:
Issue: Machine learning applications can raise ethical concerns, such as privacy violations or biased decision-making.
Example: An AI-powered lending model might unfairly deny loans to certain demographic groups, perpetuating
historical biases.
9. Model Robustness:
Issue: Models can be sensitive to small changes in input data, making them susceptible to adversarial attacks.
Example: An autonomous vehicle's image recognition system might misinterpret a small sticker on a stop sign, leading
to a potentially dangerous situation.
10. Continual Learning:
Issue: Traditional models might struggle to adapt to new data over time without forgetting previous knowledge.
Example: A language translation model needs to continuously learn and incorporate new language patterns as
languages evolve.
Machine learning vs data science:
Aspect            | Machine Learning                                                           | Data Science
Focus             | Building algorithms to learn from data and make predictions or decisions  | Extracting insights from data to inform decisions and strategies
Techniques        | Decision trees, neural networks, support vector machines, etc.            | Statistical analysis, data visualization, data preprocessing, etc.
Main Goal         | Develop models that improve performance over time through experience      | Derive insights, trends, and patterns from data
Activities        | Training models, prediction, classification, regression                   | Data collection, cleaning, analysis, visualization
Expertise         | Algorithm design, model evaluation, feature engineering                   | Statistical analysis, domain expertise, programming
Role              | Specialized subset of data science                                        | Broad field encompassing various activities
Data Utilization  | Utilizes data to train and improve models                                 | Utilizes data for analysis and decision-making
Integration       | Used within data science for predictive modeling                          | Part of the broader data analysis process
Dependency        | Requires data for training and evaluation                                 | Requires data for analysis and insights
Quiz:
Q.1 In supervised learning, what type of data does the model learn from?
a) Unlabeled data
b) Labeled data
c) Noisy data
d) Both labeled and unlabeled data
Q.2 What type of data does the model learn from in unsupervised learning?
a) Labeled data
b) Noisy data
c) Unlabeled data
d) Both labeled and unlabeled data
Q.3 What does semi-supervised learning utilize?
a) Only labeled data
b) Only unlabeled data
c) Both labeled and unlabeled data
d) Noisy data
Quiz:
Q.4 In reinforcement learning, how does the model learn to make decisions?
a) By receiving labeled data
b) By interacting with an environment and receiving feedback
c) By clustering data points
d) By memorizing patterns in the data
Q.5 What is the primary characteristic of unsupervised learning?
a) The model learns from both labeled and unlabeled data, using a combination of supervised and unsupervised
techniques.
b) Input data is labeled, and the model learns to map input to output based on provided examples.
c) Input data is not labeled, and the model learns to find patterns or structure in the data.
d) The model interacts with an environment, receiving feedback in the form of rewards or penalties.
Q.6 What is an example of supervised learning?
a) Clustering
b) Reinforcement learning
c) Classification
d) Dimensionality reduction
Quiz:
Q.7 What is transfer learning in machine learning?
a) It refers to transferring data between different devices.
b) It involves transferring knowledge from one machine learning task to another.
c) It is the process of transferring data from a local machine to a cloud server.
d) It involves transferring data from one domain to another without any modifications.
Q.8 What is a key characteristic of deep learning algorithms?
a) They require a small amount of data to train effectively.
b) They only work with shallow neural networks.
c) They involve the use of multiple layers to learn hierarchical representations of data.
d) They are not suitable for processing unstructured data.
Q.9 What is a characteristic of a well-defined learning problem?
a) Ambiguity in the desired output
b) Lack of available data
c) Clear specification of input and output
d) Complexity beyond current technology
Quiz:
Q.10 What is an example of bias in machine learning?
a) Selecting a model that is too simple
b) Selecting a model that is too complex
c) Failing to consider certain features that are important for prediction
d) Fitting the training data too closely
Q.11 What is the main goal of the Candidate Elimination Algorithm?
a) To find the best hyperparameters for a model
b) To eliminate candidates for the final model based on their performance
c) To identify the most suitable algorithm for a given dataset
d) To incrementally update the version space based on observed data
Q.12 What does inductive bias refer to in machine learning?
a) The inherent limitations of the learning algorithm
b) The bias introduced by the data collection process
c) The bias towards simpler models
d) The bias towards complex models
Quiz:
Q.13 What does sensitivity analysis in machine learning involve?
a) Analyzing the sensitivity of a model's predictions to changes in its parameters
b) Analyzing the sensitivity of the dataset
c) Sensing the environment for data
d) Analyzing the sensitivity of the loss function
Q.14 What is underfitting in machine learning?
a) When a model performs well on the training data but poorly on unseen data
b) When a model performs poorly on both the training and unseen data
c) When a model is too complex and captures noise in the training data
d) When a model fails to capture the underlying patterns in the data
Q.15 What is overfitting in machine learning?
a) The model fits the training data too closely and fails to generalize well to unseen test data.
b) The model's predictions are highly sensitive to small changes in the training data, leading to inaccurate
performance.
c) The model performs poorly on both the training and unseen test data due to underfitting.
d) The model tends to oversimplify the underlying patterns in the data, resulting in biased predictions.
References:
1. https://data-flair.training/blogs/machine-learning-tutorial/
2. https://www.javatpoint.com/supervised-machine-learning
3. https://towardsdatascience.com/unsupervised-machine-learning-example-in-keras-8c8bf9e63ee0
4. https://www.enjoyalgorithms.com/blogs/supervised-unsupervised-and-semisupervised-learning
5. https://techvidvan.com/tutorials/reinforcement-learning/
6. https://www.javatpoint.com/transfer-learning-in-machine-learning
7. https://john.sisler.info/resume/deep-learning-specialization/neural-networks-and-deep-learning
8. https://www.simplilearn.com/tutorials/machine-learning-tutorial/classification-in-machine-learning
9. https://www.javatpoint.com/bias-and-variance-in-machine-learning
10. https://www.superannotate.com/blog/overfitting-and-underfitting-in-machine-learning
11. https://www.geeksforgeeks.org/ml-find-s-algorithm/
12. https://www.edureka.co/blog/find-s-algorithm-in-machine-learning/
13. https://www.geeksforgeeks.org/ml-candidate-elimination-algorithm/
14. https://www.getwayssolution.com/2019/12/candidate-elimination-algorithm-concept.html
