DL Unit 1
Q1. Discuss Artificial Intelligence and Machine Learning. Discuss the applications of AI.
Artificial intelligence:
1. It is the effort to automate intellectual tasks normally performed by humans.
2. Artificial intelligence is the simulation of human intelligence processes by machines,
especially computer systems.
3. As such, AI is a general field that encompasses machine learning and deep learning, but that
also includes many more approaches that don’t involve any learning.
4. Early chess programs, for instance, only involved hardcoded rules crafted by programmers,
and didn’t qualify as machine learning. They lacked the ability to improve or adapt their
gameplay through self-learning or data-driven techniques, which are characteristic of
machine learning approaches.
5. Specific applications of AI include expert systems, natural language processing, speech
recognition and machine vision.
Machine learning:
1. In classical programming, the paradigm of symbolic AI, humans input rules (a program) and
data to be processed according to these rules, and out come answers (see figure 1.2).
2. With machine learning, humans input data as well as the answers expected from the data, and
out come the rules. These rules can then be applied to new data to produce original answers.
3. Although machine learning only started to flourish in the 1990s, it has quickly become the
most popular and most successful subfield of AI, a trend driven by the availability of faster
hardware and larger datasets.
4. Machine learning is tightly related to mathematical statistics, but it differs from statistics in
several important ways.
5. Unlike statistics, machine learning tends to deal with large, complex datasets (such as a
dataset of millions of images, each consisting of tens of thousands of pixels) for which
classical statistical analysis such as Bayesian analysis would be impractical.
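The contrast between classical programming (rules + data give answers) and machine learning (data + answers give rules) can be sketched in Python. This is only a toy illustration under stated assumptions: the temperature data, the function names `is_hot_rule` and `learn_threshold`, and the brute-force threshold search are all hypothetical and chosen for the example.

```python
# Classical programming: a human writes the rule by hand.
def is_hot_rule(temp_c):
    return temp_c > 30  # threshold chosen by the programmer


# Machine learning: the rule (here, a threshold) is found from data + answers.
def learn_threshold(temps, labels):
    """Return the candidate threshold that classifies the most examples correctly."""
    best_t, best_correct = None, -1
    for t in sorted(temps):
        correct = sum((x > t) == y for x, y in zip(temps, labels))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


# Hypothetical data (inputs) and expected answers (labels).
temps = [12, 18, 22, 27, 31, 35, 40]
labels = [False, False, False, False, True, True, True]

threshold = learn_threshold(temps, labels)


# The learned rule can now be applied to new data.
def is_hot_learned(temp_c):
    return temp_c > threshold
```

Here the "rule" is a single number, but the idea is the same for larger models: the program is not told the rule, it searches for one that reproduces the given answers.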
Learning representations from data:
Machine learning discovers rules to execute a data-processing task, given examples of what’s expected. So, to
do machine learning, we need three things:
a. Input data points—For instance, if the task is speech recognition, these data points
could be sound files of people speaking. If the task is image tagging, they could be
pictures.
b. Examples of the expected output—In a speech-recognition task, these could be
human-generated transcripts of sound files. In an image task, expected outputs could
be tags such as “dog,” “cat,” and so on.
c. A way to measure whether the algorithm is doing a good job—This is necessary in
order to determine the distance between the algorithm’s current output and its
expected output. The measurement is used as a feedback signal to adjust the way the
algorithm works. This adjustment step is what we call learning.
d. As an example, consider a set of points in a plane, some white and some black, that must be
classified by color. What we need is a new representation of our data that cleanly separates the white
points from the black points. One transformation we could use, among many other
possibilities, would be a coordinate change, as illustrated in the figure.
DR DAVID SEP 2023 DEEP LEARNING TECHNIQUES UNIT I IV CSE
e. In this new coordinate system, the coordinates of our points can be said to be a
new representation of our data.
f. With this representation, the black/white classification problem can be expressed
as a simple rule: “Black points are such that x > 0,” or “White points are such that
x < 0.”
g. This new representation basically solves the classification problem. In this case,
we defined the coordinate change by hand.
h. But if instead we tried systematically searching for different possible coordinate
changes, and used as feedback the percentage of points being correctly classified,
then we would be doing machine learning.
i. Learning, in the context of machine learning, describes an automatic search
process for better representations.
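The search described in points h and i can be sketched as follows. This is a minimal illustration, assuming a hypothetical set of eight points and restricting the "possible coordinate changes" to rotations of the axes in 15-degree steps; the names `new_x`, `accuracy`, and `best_angle` are made up for the example.

```python
import math

# Hypothetical toy data: (x, y, color). No single original axis separates the colors.
points = [
    (1.0, 3.0, "white"), (0.5, 2.0, "white"), (-1.0, 0.5, "white"), (-2.0, -1.0, "white"),
    (3.0, 1.0, "black"), (2.0, 0.5, "black"), (0.5, -1.0, "black"), (-1.0, -2.0, "black"),
]


def new_x(x, y, angle):
    """x-coordinate of a point after rotating the coordinate axes by `angle` radians."""
    return x * math.cos(angle) + y * math.sin(angle)


def accuracy(angle):
    """Feedback signal: fraction of points classified correctly by the rule
    'black points are such that new x > 0' in the rotated coordinate system."""
    correct = 0
    for x, y, color in points:
        predicted = "black" if new_x(x, y, angle) > 0 else "white"
        correct += predicted == color
    return correct / len(points)


# Systematic search over candidate coordinate changes, guided by the feedback signal.
candidate_angles = [math.radians(d) for d in range(-180, 180, 15)]
best_angle = max(candidate_angles, key=accuracy)
```

With the original axes (`angle = 0`) the simple rule misclassifies some points, while the best rotation found by the search separates the two colors completely on this toy data.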
The “deep” in deep learning:
The deep in deep learning isn’t a reference to any kind of deeper understanding achieved by the
approach; rather, it stands for this idea of successive layers of representations.
How many layers contribute to a model of the data is called the depth of the model. Modern deep
learning often involves tens or even hundreds of successive layers of representations— and they’re all
learned automatically from exposure to training data.
Shallow learning: Other approaches to machine learning tend to focus on learning only one or two layers of
representations of the data; hence, they’re sometimes called shallow learning.
AI vs ML vs DL:
The following image gives a glimpse of the relationship among Artificial Intelligence, Machine Learning and
Deep Learning, and the descriptions below give examples of each.
AI: AI stands for Artificial Intelligence, and is basically the study/process which enables machines to mimic [...]
ML: ML stands for Machine Learning, and is the study that uses statistical methods enabling [...]
DL: DL stands for Deep Learning, and is the study that makes use of Neural Networks (similar to neurons present in the human brain) to imitate functionality just [...]
Applications of AI:
Personalized recommendations: E-commerce sites and streaming services like Amazon and Netflix
use AI algorithms to analyze users’ browsing and viewing history to recommend products and content
that they are likely to be interested in.
Predictive maintenance: AI-powered predictive maintenance systems analyze data from sensors and
other sources to predict when equipment is likely to fail, helping to reduce downtime and maintenance
costs.
Medical diagnosis: AI-powered medical diagnosis systems analyze medical images and other patient
data to help doctors make more accurate diagnoses and treatment plans.
Autonomous vehicles: Self-driving cars and other autonomous vehicles use AI algorithms and
sensors to analyze their environment and make decisions about speed, direction, and other factors.
Virtual Personal Assistants (VPA) like Siri or Alexa – these use natural language processing to
understand and respond to user requests, such as playing music, setting reminders, and answering
questions.
Autonomous vehicles – self-driving cars use AI to analyze sensor data, such as cameras and lidar, to
make decisions about navigation, obstacle avoidance, and route planning.
Fraud detection – financial institutions use AI to analyze transactions and detect patterns that are
indicative of fraud, such as unusual spending patterns or transactions from unfamiliar locations.
Image recognition – AI is used in applications such as photo organization, security systems, and
autonomous robots to identify objects, people, and scenes in images.
Natural language processing – AI is used in chatbots and language translation systems to understand
and generate human-like text.
Predictive analytics – AI is used in industries such as healthcare and marketing to analyze large
amounts of data and make predictions about future events, such as disease outbreaks or consumer
behavior.
Game-playing AI – AI algorithms have been developed to play games such as chess, Go, and poker
at a superhuman level, by analyzing game data and making predictions about the outcomes of moves.
Machine Learning (ML) is a subset of Artificial Intelligence (AI) that involves the use of algorithms and
statistical models to allow a computer system to “learn” from data and improve its performance over time,
without being explicitly programmed to do so.
Applications of ML:
Speech recognition: Machine learning algorithms are used in speech recognition systems to
transcribe speech and identify the words spoken. These systems are used in virtual assistants like Siri
and Alexa, as well as in call centers and other applications.
Natural language processing (NLP): Machine learning algorithms are used in NLP systems to
understand and generate human language. These systems are used in chatbots, virtual assistants, and
other applications that involve natural language interactions.
Recommendation systems: Machine learning algorithms are used in recommendation systems to
analyze user data and recommend products or services that are likely to be of interest. These systems
are used in e-commerce sites, streaming services, and other applications.
Sentiment analysis: Machine learning algorithms are used in sentiment analysis systems to classify
the sentiment of text or speech as positive, negative, or neutral. These systems are used in social
media monitoring and other applications.
Predictive maintenance: Machine learning algorithms are used in predictive maintenance systems to
analyze data from sensors and other sources to predict when equipment is likely to fail, helping to
reduce downtime and maintenance costs.
Spam filters in email – ML algorithms analyze email content and metadata to identify and flag
messages that are likely to be spam.
Recommendation systems – ML algorithms are used in e-commerce websites and streaming services
to make personalized recommendations to users based on their browsing and purchase history.
Predictive maintenance – ML algorithms are used in manufacturing to predict when machinery is
likely to fail, allowing for proactive maintenance and reducing downtime.
Credit risk assessment – ML algorithms are used by financial institutions to assess the credit risk of
loan applicants, by analyzing data such as their income, employment history, and credit score.
Customer segmentation – ML algorithms are used in marketing to segment customers into different
groups based on their characteristics and behavior, allowing for targeted advertising and promotions.
Fraud detection – ML algorithms are used in financial transactions to detect patterns of behavior that
are indicative of fraud, such as unusual spending patterns or transactions from unfamiliar locations.
Speech recognition – ML algorithms are used to transcribe spoken words into text, allowing for
voice-controlled interfaces and dictation software.
Applications of DL:
Generative models: Deep learning algorithms are used in generative models to create new content
based on existing data. These systems are used in image and video generation, text generation, and
other applications.
Autonomous vehicles: Deep learning algorithms are used in self-driving cars and other autonomous
vehicles to analyze sensor data and make decisions about speed, direction, and other factors.
Image classification – Deep Learning algorithms are used to recognize objects and scenes in images,
such as recognizing faces in photos or identifying items in an image for an e-commerce website.
Speech recognition – Deep Learning algorithms are used to transcribe spoken words into text,
allowing for voice-controlled interfaces and dictation software.
Natural language processing – Deep Learning algorithms are used for tasks such as sentiment
analysis, language translation, and text generation.
Recommender systems – Deep Learning algorithms are used in recommendation systems to make
personalized recommendations based on users’ behavior and preferences.
Fraud detection – Deep Learning algorithms are used in financial transactions to detect patterns of
behavior that are indicative of fraud, such as unusual spending patterns or transactions from
unfamiliar locations.
Game-playing AI – Deep Learning algorithms have been used to develop game-playing AI that can
compete at a superhuman level, such as the AlphaGo AI that defeated the world champion in the
game of Go.
Time series forecasting – Deep Learning algorithms are used to forecast future values in time series
data, such as stock prices, energy consumption, and weather patterns.
Ack: https://www.geeksforgeeks.org/
Q3. Explain how machine learning evolved through Probabilistic Modeling, Early Neural Networks, and
Kernel Methods.
Probabilistic modeling:
• Probabilistic modeling is the application of the principles of statistics to data analysis.
• It was one of the earliest forms of machine learning, and it’s still widely used to this day.
• One of the best-known algorithms in this category is the Naive Bayes algorithm.
• This form of data analysis predates computers and was applied by hand decades before its first
computer implementation (most likely dating back to the 1950s).
• A closely related model is logistic regression, often abbreviated as "logreg," which is sometimes
known as the "hello world" of modern machine learning.
• Like Naive Bayes, logreg has been around for a long time, even before computers, but remains valuable
due to its simplicity and flexibility.
• Despite its name, logreg is actually a classification algorithm, not a regression one.
• It's often the first algorithm data scientists try on a dataset to get a feel for the classification task.
• Logistic regression: The logistic regression model is used for binary classification problems. It
estimates the probability that an instance belongs to a certain class.
• The logistic function (sigmoid) is used to map the output to the range (0, 1):
p = 1 / (1 + exp(-(b0 + b1*x1 + b2*x2 + ... + bn*xn)))
• b0 is the intercept (bias), and b1, b2, ..., bn are the coefficients (weights) of the features x1, x2, ..., xn.
• exp is the exponential function.
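As a minimal sketch of how a logistic regression model turns a weighted sum of features into a probability, the following Python applies the sigmoid, 1 / (1 + exp(-z)), to z = b0 + b1*x1 + ... + bn*xn. The coefficient values are hypothetical and picked by hand; in practice they would be learned from training data.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(bias, weights, features):
    """Estimated probability that the instance belongs to the positive class."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_class(bias, weights, features, cutoff=0.5):
    """Binary decision: class 1 if the probability reaches the cutoff, else class 0."""
    return 1 if predict_proba(bias, weights, features) >= cutoff else 0


# Hypothetical coefficients (b0 = 0.5, b1 = 2.0, b2 = -1.0), not fitted to any data.
b0, b = 0.5, [2.0, -1.0]

p_boundary = predict_proba(b0, b, [1.0, 2.5])   # z = 0.5 + 2.0 - 2.5 = 0
p_positive = predict_proba(b0, b, [3.0, 1.0])   # z = 0.5 + 6.0 - 1.0 = 5.5
```

A point with z = 0 sits exactly on the decision boundary and gets probability 0.5; larger z pushes the probability toward 1, smaller z toward 0.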
Early NN:
1. Early iterations of neural networks have been completely supplanted by the modern variants.
2. Although the core ideas of neural networks were investigated in toy forms as early as the
1950s, the approach took decades to get started.
3. For a long time, the missing piece was an efficient way to train large neural networks.
4. This changed in the mid-1980s, when multiple people independently rediscovered the
Backpropagation algorithm— a way to train chains of parametric operations using gradient-
descent optimization—and started applying it to neural networks.
5. The first successful practical application of neural nets came in 1989 from Bell Labs, when
Yann LeCun combined the earlier ideas of convolutional neural networks and
backpropagation, and applied them to the problem of classifying handwritten digits.
6. The resulting network, dubbed LeNet, was used by the United States Postal Service in the
1990s to automate the reading of ZIP codes on mail envelopes.
Kernel methods:
• In the 1990s, neural networks gained recognition among researchers.
• However, a new approach called "kernel methods" became popular.
• Kernel methods quickly overshadowed neural networks during that time.
• The best-known kernel method is the support vector machine (SVM). An older linear formulation of the SVM
was published by Vladimir Vapnik and Alexey Chervonenkis in 1963.
• Conclusion: Neural networks experienced a decline during this period but later resurged in the deep
learning era.
• SVMs solve classification problems by finding good decision boundaries between classes. They find
these boundaries in two steps:
1. The data is mapped to a new high-dimensional representation where the decision boundary
can be expressed as a hyperplane (if the data were two-dimensional, as in the figure, a hyperplane would
be a straight line).
2. A good decision boundary (a separation hyperplane) is computed by trying to maximize the
distance between the hyperplane and the closest data points from each class, a step called maximizing
the margin.
• This allows the boundary to generalize well to new samples outside of the training dataset.
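The margin idea from the second step can be illustrated with a small Python sketch. It does not train an SVM and skips the mapping to a high-dimensional space; it only measures the margin of two hand-picked candidate hyperplanes on hypothetical 2-D data and keeps the one with the larger margin. The names `margin`, `points`, and `candidates` are assumptions made for the example.

```python
import math


def margin(w, b, points):
    """Smallest signed distance from any labeled point to the hyperplane w.x + b = 0.

    Labels are +1 or -1. A negative result means at least one point is on the wrong side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        label * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, label in points
    )


# Hypothetical, linearly separable data: ((x1, x2), label).
points = [((1.0, 1.0), -1), ((2.0, 1.0), -1), ((4.0, 3.0), 1), ((5.0, 4.0), 1)]

# Two candidate decision boundaries, given as (w, b).
candidates = {
    "A": ((1.0, 0.0), -3.0),  # the vertical line x1 = 3
    "B": ((1.0, 1.0), -5.0),  # the diagonal line x1 + x2 = 5
}

# Prefer the boundary whose closest points are farthest away.
best = max(candidates, key=lambda name: margin(*candidates[name], points))
```

Both candidates separate the two classes, but boundary B leaves more room between itself and the nearest points, which is what "maximizing the margin" rewards. A real SVM solves an optimization problem over all possible hyperplanes rather than comparing a fixed list.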
Q4. Explain how machine learning evolved through Decision Trees, Random Forests, and Gradient Boosting
Machines.
Decision trees: Decision trees are flowchart-like structures that classify input data or predict outputs.
Advantages of decision trees: They are easy to visualize and interpret, and simple and intuitive.
Types of decision trees:
There are two main types of decision trees, based on the target variable: categorical variable
decision trees and continuous variable decision trees.
• Disadvantages: Decision trees can look very different depending on the question they start
with. If we built our decision tree with a different first question, the order of the
questions in the tree could look very different.
• In summary, decision trees aren’t very useful by themselves despite being easy to build. It isn’t
ideal to rely on a single decision tree as a general model for making predictions.
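The flowchart-like structure of a decision tree can be sketched as a nested dictionary of yes/no questions. The tree below is hand-written and purely hypothetical (a real tree would be learned from data), and the question names `has_default_history` and `high_income` are invented for the example.

```python
# Each internal node asks a yes/no question; each leaf is a final answer.
loan_tree = {
    "question": "has_default_history",
    "yes": "No",
    "no": {
        "question": "high_income",
        "yes": "Yes",
        "no": "No",
    },
}


def classify(node, sample):
    """Walk from the root to a leaf, following the answer to each question."""
    while isinstance(node, dict):
        answer = sample[node["question"]]
        node = node["yes"] if answer else node["no"]
    return node


applicant = {"has_default_history": False, "high_income": True}
decision = classify(loan_tree, applicant)  # "Issue the loan: Yes or No"
```

Starting the tree with `high_income` instead would give a differently shaped tree for the same task, which is the sensitivity to the first question noted above.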
Random Forest:
The Random Forest algorithm introduced a robust, practical take on decision-tree learning.
1. Random forests are applicable to a wide range of problems; they’re almost always the
second-best algorithm for any shallow machine-learning task.
2. When the popular machine-learning competition website Kaggle (http://kaggle.com) got
started in 2010, random forests quickly became a favorite on the platform. This
continued until 2014, when gradient boosting machines took over.
3. As the name suggests, a random forest builds a number of decision trees independently. Each decision
tree is a simple predictor, but their results are aggregated into a single result. In theory, this aggregate should be
closer to the true result, thanks to collective intelligence.
4. As mentioned previously, each decision tree can look very different depending on the data; a random
forest randomises the construction of its decision trees to get a variety of different predictions.
5. A disadvantage of random forests is that they’re harder to interpret than a single decision tree.
They’re also slower to build, since a random forest needs to build and evaluate each decision tree
independently.
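The "build many randomised trees, then aggregate" idea can be sketched in plain Python. This is a heavily simplified illustration, not a full random forest: each "tree" is a one-question stump on a single numeric feature, randomness comes only from bootstrap sampling (real random forests also randomise the features considered at each split), and all names and data are hypothetical.

```python
import random
from collections import Counter


def train_stump(samples):
    """Pick the threshold t so that the rule 'x > t means class 1' fits the samples best."""
    best_t, best_correct = None, -1
    for t in sorted({x for x, _ in samples}):
        correct = sum((x > t) == bool(y) for x, y in samples)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def train_forest(samples, n_trees, rng):
    """Train each stump independently on its own bootstrap sample."""
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(samples) for _ in samples]  # sample with replacement
        forest.append(train_stump(bootstrap))
    return forest


def predict(forest, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(int(x > t) for t in forest)
    return votes.most_common(1)[0][0]


# Hypothetical 1-D data: (feature value, class label).
data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
forest = train_forest(data, n_trees=25, rng=random.Random(0))
```

Because every stump sees a slightly different bootstrap sample, the learned thresholds vary from stump to stump, and the vote averages out their individual quirks.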
Gradient Boosting Machines:
• Both random forests and gradient boosting machines combine weak prediction models, like decision
trees, in a clever way to make better predictions.
• Gradient boosting machines iteratively train new models that specialize in addressing the weak points
of the previous models, a process called gradient boosting.
• When applied to decision trees, gradient boosting usually outperforms random forests, though they
share similar properties.
• It's considered one of the best techniques for dealing with non-perceptual data nowadays.
• Along with deep learning, it's one of the most commonly used techniques in Kaggle competitions.
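A minimal sketch of gradient boosting for regression follows, assuming squared-error loss (for which the negative gradient is simply the residual) and one-split regression stumps as the weak models. The data, the function names, and the settings `n_rounds=20` and `learning_rate=0.5` are hypothetical choices for illustration; production libraries add many refinements.

```python
def fit_stump(xs, residuals):
    """Regression stump: choose the split that minimizes squared error on the residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:  # drop the largest value so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x <= t else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a new stump to the current errors."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(n_rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]  # weak points of the model so far
        stumps.append(fit_stump(xs, residuals))
    return predict


# Hypothetical 1-D regression data.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 3.2]

model = gradient_boost(xs, ys)
mean_y = sum(ys) / len(ys)
baseline_mse = sum((y - mean_y) ** 2 for y in ys) / len(ys)
boosted_mse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
```

Unlike a random forest, the stumps here are not independent: each one is trained on what the previous ones still get wrong, so the training error of the combined model shrinks round by round.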
1. Supervised learning:
1) It consists of learning to map input data to known targets (also called annotations), given a
set of examples (often annotated by humans).
2) Almost all applications of deep learning that are in the spotlight these days
belong to this category, such as optical character recognition, speech recognition, image
classification, and language translation.
3) Decision trees come under supervised learning: when training the model, learning is
guided by the target class or dependent variable, for example “Issue the loan:
Yes or No” or “Heart disease: Yes or No”.
4) Although supervised learning mostly consists of classification and regression, there are more
exotic variants as well, including the following (with examples):
Ex1. Sequence generation—Given a picture, predict a caption describing it. Sequence generation
can sometimes be reformulated as a series of classification problems (such as repeatedly predicting
a word or token in a sequence).
Ex2. Syntax tree prediction—Given a sentence, predict its decomposition into a syntax tree.
Ex 3. Object detection—Given a picture, draw a bounding box around certain objects inside the
picture. This can also be expressed as a classification problem (given many candidate bounding
boxes, classify the contents of each one) or as a joint classification and regression problem, where
the bounding-box coordinates are predicted via vector regression.
2. Unsupervised learning:
Clustering:
1. Objective: Clustering algorithms aim to group similar data points together into
clusters or segments based on their inherent similarities.
2. Use Cases: Clustering is commonly used for tasks such as customer segmentation,
image segmentation, document clustering, and anomaly detection. Popular
algorithms include K-Means, Hierarchical Clustering, and DBSCAN.
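As a small sketch of how clustering groups similar points without any labels, here is a bare-bones K-Means on one-dimensional data. The values, the initial centers, and the fixed number of iterations are hypothetical choices for the example; library implementations handle initialization and convergence more carefully.

```python
def k_means_1d(values, centers, n_iters=10):
    """Alternate between assigning points to the nearest center and recomputing centers."""
    clusters = []
    for _ in range(n_iters):
        # Assignment step: each value joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its cluster (unchanged if empty).
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled data with two obvious groups.
values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = k_means_1d(values, centers=[0.0, 5.0])
```

No expected outputs are given to the algorithm; the grouping emerges only from the distances between the data points.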
Dimensionality Reduction:
1. Objective: Dimensionality reduction techniques aim to reduce the number of
features (or dimensions) in the data while preserving its important characteristics.
This can help in simplifying the data and removing noise.
2. Use Cases: Dimensionality reduction is used for visualization, feature selection,
and reducing the computational complexity of models. Principal Component
Analysis (PCA) and t-SNE (t-Distributed Stochastic Neighbor Embedding) are
common techniques.
3. Self-supervised learning:
Self-supervised learning is a machine learning paradigm that falls under the broader
category of unsupervised learning.
In self-supervised learning, the algorithm learns from the data itself, but unlike
traditional unsupervised learning, it does so by creating its own supervision signals or
labels from the input data, rather than relying on external labels provided by humans.
Examples:
a. Autoencoders are a well-known instance of self-supervised learning, where the generated
targets are the input itself, unmodified.
b. Predicting the next frame in a video sequence, given past frames, is another instance
of self-supervised learning.
c. The model learns to understand the temporal dependencies in video data by
predicting the future frame.
d. Here, supervision comes from future input data (i.e., the next frame).
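The way self-supervised learning derives its own labels from the data can be sketched with a sequence of frames. The function name `make_next_step_pairs`, the `window` parameter, and the placeholder frame names are assumptions for the example; the point is only that no human annotation is involved.

```python
def make_next_step_pairs(sequence, window):
    """Build (past items, next item) training pairs from an unlabeled sequence.

    The target for each example is taken from the sequence itself.
    """
    return [
        (sequence[i:i + window], sequence[i + window])
        for i in range(len(sequence) - window)
    ]


# Hypothetical video represented as an ordered list of frames.
frames = ["f0", "f1", "f2", "f3", "f4"]
pairs = make_next_step_pairs(frames, window=2)
```

Each pair holds two past frames as the input and the frame that follows them as the target, so a supervised-style model can be trained on data that was never labeled by hand.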
4. Reinforcement learning:
Q6. Is it possible to evaluate machine learning models? Explain simple holdout and k-fold cross-validation.
• Yes, it is possible to evaluate machine learning models to assess their performance and
generalization capabilities. Two commonly used techniques for model evaluation are simple
holdout validation and k-fold cross-validation.
• Simple holdout validation and k-fold cross-validation are methods for evaluating machine
learning models. Simple holdout is straightforward but can be sensitive to the initial random
split.
• K-fold cross-validation is more robust, providing a better estimate of model performance,
but it's computationally more intensive.
1. Simple Holdout Validation:
• Concept: In simple holdout validation, the dataset is divided into two subsets: a training set
and a testing set.
• Usage: The training set is used to train the machine learning model.
The testing set is used to evaluate the model's performance and estimate how well it
will generalize to unseen data.
• Process: Typically, a random portion of the dataset (e.g., 70-80%) is used for training, while
the remaining portion (e.g., 20-30%) is used for testing.
The model is trained on the training set, and its performance is assessed on the testing set using
the chosen evaluation metrics.
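The holdout split described above can be sketched in a few lines of Python. The function name `holdout_split`, the default `test_fraction`, and the fixed `seed` are assumptions made for the example; libraries such as scikit-learn provide ready-made equivalents.

```python
import random


def holdout_split(data, test_fraction=0.2, seed=0):
    """Shuffle the data once, then hold out a fraction of it for testing."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # the result depends on this random split
    n_test = int(len(shuffled) * test_fraction)
    test_set = shuffled[:n_test]
    train_set = shuffled[n_test:]
    return train_set, test_set


# Hypothetical dataset of 10 samples, split 70% / 30%.
samples = list(range(10))
train_set, test_set = holdout_split(samples, test_fraction=0.3)
```

Changing `seed` changes which samples land in the test set, which is why a single holdout estimate can be sensitive to the initial random split.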
2. K-fold Cross-Validation:
• Concept: In k-fold cross-validation, the dataset is divided into k equally sized folds (subsets).
• Usage: The model is trained and tested k times, where each fold serves as the testing set once,
and the remaining k-1 folds are used for training in each iteration.
• Process: The process repeats k times, each time with a different fold as the test set and the
others as the training set. The evaluation metrics are averaged over the k iterations to obtain a
more robust estimate of model performance.
Pros:
Data Utilization: Every data point is used for both training and testing across the k iterations, which can be especially
valuable with limited data.
Cons:
Computationally Intensive: It's more computationally demanding than simple holdout validation, as
the model is trained and tested k times.
Variability: While reducing variability, the results can still vary depending on how the data is
split.
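The k-fold procedure can be sketched as follows. The names `k_fold_indices` and `cross_validate` are hypothetical, the folds are taken in order without shuffling, and when the dataset size is not divisible by k the first folds get one extra sample; the `evaluate` callback stands in for "train a model on the training indices and score it on the test indices".

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) so that each fold is the test set exactly once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(n_samples, k, evaluate):
    """Average the score returned by `evaluate(train, test)` over the k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


# Example with 10 samples and 3 folds.
folds = list(k_fold_indices(10, 3))

# Placeholder scorer for illustration only: the fraction of data used for training.
average_score = cross_validate(10, 5, lambda train, test: len(train) / 10)
```

The model is fitted k times, once per fold, which is where the extra computational cost compared with a single holdout split comes from.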
***************
The tongue of the wise adorns knowledge, but the mouth of the
fool gushes folly