DL Unit 1
In the real world, we are surrounded by humans who can learn from their experiences, and by computers or machines that simply follow our instructions. But can a machine also learn from experience or past data, the way a human does? This is where Machine Learning comes in.
……………… end…………
Some everyday applications of Machine Learning include:
Digital assistants
Voice-activated television remotes
Fraud detection
Automatic facial recognition
…………………….. end…………….
Q) Probabilistic modeling?
Generative models:
Generative models aim to model the joint distribution of the input and
output variables. These models generate new data based on the
probability distribution of the original dataset. Generative models are
powerful because they can generate new data that resembles the training
data. They can be used for tasks such as image and speech
synthesis, language translation, and text generation.
Discriminative models:
Discriminative models aim to model the conditional distribution of the output variable given the input variable. They learn a decision boundary that separates the different classes of the output variable. Discriminative models are useful when the focus is on making accurate predictions rather than on generating new data. They can be used for tasks such as image recognition, speech recognition, and sentiment analysis.
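A rough sketch of the contrast (not from these notes: scikit-learn and a synthetic dataset are assumptions). Gaussian Naive Bayes is a generative classifier that models how each class generates its features, while logistic regression is a discriminative classifier that learns P(y | x) and the decision boundary directly:

```python
# Generative vs. discriminative classifiers on synthetic data (illustrative only).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Generative: fits class-conditional Gaussians, i.e. models P(x | y) and P(y).
generative = GaussianNB().fit(X_train, y_train)
# Discriminative: fits the conditional P(y | x) / decision boundary directly.
discriminative = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("Gaussian Naive Bayes accuracy:", generative.score(X_test, y_test))
print("Logistic regression accuracy: ", discriminative.score(X_test, y_test))
```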
Graphical models:
These models use graphical representations to show the conditional
dependence between variables. They are commonly used for tasks such
as image recognition, natural language processing, and causal inference.
Deep learning, a subset of machine learning, also relies on probabilistic
models. Probabilistic models are used to optimize complex models with
many parameters, such as neural networks. By incorporating uncertainty
into the model training process, deep learning algorithms can provide
higher accuracy and generalization capabilities. One popular technique is
variational inference, which allows for efficient estimation of posterior
distributions.
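A minimal sketch of variational inference with the reparameterization trick (everything here is an assumption for illustration: the conjugate Gaussian model, NumPy, and the chosen learning rate; the model is deliberately simple so the exact posterior is available for comparison):

```python
# Fit a Gaussian q(mu) = N(m, s^2) to the posterior of mu, where the prior is
# mu ~ N(0, 1) and the data are x_i ~ N(mu, 1). Illustrative sketch only.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(2.0, 1.0, size=50)      # observed data
n = len(x)

m, log_s = 0.0, 0.0                    # variational parameters: mean and log std
lr, k = 0.01, 32                       # learning rate, Monte Carlo samples per step

for _ in range(2000):
    s = np.exp(log_s)
    eps = rng.standard_normal(k)
    mu = m + s * eps                   # reparameterized samples from q(mu)
    # d/d mu of [log p(x | mu) + log p(mu)] evaluated at each sample
    dlogp = (x.sum() - n * mu) - mu
    m += lr * dlogp.mean()                            # gradient of the ELBO w.r.t. m
    log_s += lr * ((dlogp * s * eps).mean() + 1.0)    # the +1 comes from the entropy of q

exact_mean, exact_std = x.sum() / (n + 1), (1.0 / (n + 1)) ** 0.5
print(f"variational posterior: mean={m:.3f}, std={np.exp(log_s):.3f}")
print(f"exact posterior:       mean={exact_mean:.3f}, std={exact_std:.3f}")
```

The two printed lines should roughly agree, which is the point: variational inference turns posterior estimation into an optimization problem over the parameters of q.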
…………………………………. END……………..
Q) Artificial Intelligence?
Artificial Intelligence exists when a machine can exhibit human-like skills such as learning, reasoning, and problem solving.
AI is not considered a new idea; some people say that, according to Greek myth, there were mechanical men in early days that could work and behave like humans.
o With the help of AI, you can create software or devices that can solve real-world problems easily and accurately, in areas such as healthcare, marketing, and traffic management.
o With the help of AI, you can create your own personal virtual assistant, such as Cortana, Google Assistant, or Siri.
o With the help of AI, you can build robots that can work in environments where human survival would be at risk.
o AI opens a path for other new technologies, new devices, and new opportunities.
Q) Kernel methods?
Popular kernel methods include the Support Vector Machine (SVM), the adaptive filter, the kernel perceptron, principal component analysis (PCA), and spectral clustering.
1. Support Vector Machine (SVM):
In higher dimensions it is more difficult to imagine how we can separate the data linearly and what the decision boundary looks like. In p dimensions, a hyperplane is a (p-1)-dimensional “flat” subspace within the larger p-dimensional space. In two dimensions, the hyperplane is simply a line.
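A hedged illustration of why the kernel trick matters for such hyperplanes (scikit-learn and the synthetic two-circles data are assumptions, not part of the notes): no line separates the two rings in the original 2-D space, but an RBF kernel implicitly maps the points into a space where a separating hyperplane exists.

```python
# Linear vs. RBF-kernel SVM on data that is not linearly separable (sketch only).
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_circles(n_samples=400, noise=0.1, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear").fit(X_train, y_train)   # hyperplane in the original space
rbf_svm = SVC(kernel="rbf").fit(X_train, y_train)         # hyperplane in the implicit feature space

print("Linear kernel accuracy:", linear_svm.score(X_test, y_test))
print("RBF kernel accuracy:   ", rbf_svm.score(X_test, y_test))
```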
2. Adaptive Filter:
An adaptive filter adjusts its parameters automatically, driven by an error signal and an optimization algorithm; kernelized (non-linear) variants exist as well.
3. Kernel Perceptron:
The kernel perceptron is a kernelized variant of the perceptron algorithm: it uses a kernel function to measure the similarity of unseen samples to training samples, which lets it learn non-linear classifiers.
Applications of kernel methods include:
3D reconstruction
Bioinformatics
Geostatistics
Chemoinformatics
Handwriting recognition
Information extraction
4. Principal Component Analysis (PCA):
The first principal component captures the direction of greatest variation in the data; the second principal component is orthogonal to the first and captures the remaining variation not explained by it, and so on. The principal components are uncorrelated and ordered so that the first few account for most of the variation in the actual data. Kernel principal component analysis extends PCA using kernel methods: in contrast to standard linear PCA, the kernel variant works well for a large number of attributes but becomes slow for a large number of examples.
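A minimal sketch of that contrast (scikit-learn, the concentric-circles data, and gamma=10 are assumptions chosen for illustration): with an RBF kernel, a single kernel PCA component separates the two rings, while a single linear PCA component does not.

```python
# Linear PCA vs. kernel PCA on concentric circles (illustrative sketch).
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

X, y = make_circles(n_samples=400, noise=0.05, factor=0.3, random_state=0)

z_lin = PCA(n_components=1).fit_transform(X).ravel()
z_ker = KernelPCA(n_components=1, kernel="rbf", gamma=10).fit_transform(X).ravel()

def separation(z, labels):
    # Gap between the two class means, in units of the component's spread.
    return abs(z[labels == 0].mean() - z[labels == 1].mean()) / z.std()

print("Linear PCA separation:", round(separation(z_lin, y), 2))
print("Kernel PCA separation:", round(separation(z_ker, y), 2))
```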
5. Spectral Clustering:
Its roots can be traced back to graph theory, where the method is used to identify communities of nodes in a graph based on the edges connecting them. The method is flexible enough to be applied to data that does not come from a graph as well.
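A short sketch of the graph-based idea (scikit-learn and the synthetic two-moons data are assumptions): spectral clustering builds a nearest-neighbour similarity graph and partitions it, so it can recover non-convex clusters that k-means misses.

```python
# Spectral clustering vs. k-means on the two-moons dataset (illustrative sketch).
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

X, y = make_moons(n_samples=400, noise=0.05, random_state=0)

km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
sc_labels = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                               n_neighbors=10, random_state=0).fit_predict(X)

# An adjusted Rand index of 1.0 means the true two-moon structure was recovered exactly.
print("k-means ARI:            ", round(adjusted_rand_score(y, km_labels), 2))
print("Spectral clustering ARI:", round(adjusted_rand_score(y, sc_labels), 2))
```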
…………………………………… END………………………
Q) Random forests?
Random Forest is a popular machine learning algorithm that belongs to the
supervised learning technique. It can be used for both Classification and
Regression problems in ML. It is based on the concept of ensemble
learning, which is a process of combining multiple classifiers to solve a
complex problem and to improve the performance of the model.
The greater number of trees in the forest leads to higher accuracy and
prevents the problem of overfitting.
(Diagram: the working of the Random Forest algorithm.)
Note: To better understand the Random Forest Algorithm, you should
have knowledge of the Decision Tree Algorithm.
Since the random forest combines multiple trees to predict the class of the dataset, it is possible that some decision trees predict the correct output while others do not. But together, all the trees predict the correct output. Therefore, below are two assumptions for a better Random Forest classifier:
o There should be some actual values in the feature variables of the dataset, so that the classifier can predict accurate results rather than guessed results.
o The predictions from each tree must have very low correlations with each other.
Below are some points that explain why we should use the Random Forest algorithm:
o It takes less training time compared to other algorithms.
o It predicts output with high accuracy, and it runs efficiently even for large datasets.
o It can maintain accuracy even when a large proportion of the data is missing.
The working process can be explained in the following steps:
Step-1: Select K random data points from the training set.
Step-2: Build a decision tree for each selected subset of data points.
Step-3: Choose the number N of decision trees that you want to build.
Step-4: Repeat Step-1 and Step-2 until N trees have been built.
Step-5: For a new data point, find the prediction of each decision tree, and assign the new data point to the category that wins the majority of votes.
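A rough sketch of these steps with scikit-learn's RandomForestClassifier (the library and the dataset are assumptions, not named in the notes): n_estimators plays the role of N, each tree is grown on a bootstrap sample, and prediction is a majority vote over the trees.

```python
# Random Forest classification sketch (illustrative only).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step-3: choose N trees; Steps 1, 2, and 4 happen inside fit() via bootstrap sampling.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Step-5: each tree votes and the majority class is returned.
print("Test accuracy:", forest.score(X_test, y_test))
```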
There are mainly four sectors where the Random Forest is mostly used:
o Banking: mainly for the identification of loan risk.
o Medicine: to identify disease trends and disease risks.
o Land use: to identify areas of similar land use.
o Marketing: to identify marketing trends.
Q) Decision tree?
Parent/Child node: A node that splits into sub-nodes is called a parent node, and the sub-nodes are called its child nodes.
In a decision tree, to predict the class of a given dataset, the algorithm starts from the root node of the tree. It compares the value of the root attribute with the corresponding attribute of the record (from the real dataset) and, based on the comparison, follows the branch and jumps to the next node.
At the next node, the algorithm again compares the attribute value with the values of the sub-nodes and moves further. It continues this process until it reaches a leaf node of the tree. The complete process can be better understood using the algorithm below:
o Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
o Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
o Step-3: Divide S into subsets that contain the possible values of the best attribute.
o Step-4: Generate the decision tree node that contains the best attribute.
o Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where the nodes cannot be classified further; these final nodes are called leaf nodes.
Example: Suppose a candidate has a job offer and wants to decide whether to accept it or not. To solve this problem, the decision tree starts with the root node (the Salary attribute, chosen by the ASM). The root node splits further into the next decision node (distance from the office) and one leaf node based on the corresponding labels. The next decision node splits further into one decision node (cab facility) and one leaf node. Finally, the decision node splits into two leaf nodes (Accepted offer and Declined offer).
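A toy sketch of this example (the feature encoding and every data value below are invented purely for illustration, and scikit-learn is assumed): with this made-up data, the tree places salary at the root because it is the most informative attribute.

```python
# Decision tree on a made-up "job offer" dataset (illustrative sketch).
from sklearn.tree import DecisionTreeClassifier, export_text

feature_names = ["salary_high", "near_office", "cab_facility"]
X = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 0],
     [0, 0, 1], [1, 0, 1], [0, 1, 0], [1, 1, 0]]
y = [1, 1, 1, 0, 0, 1, 0, 1]   # 1 = offer accepted, 0 = offer declined

tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(tree, feature_names=feature_names))   # shows the root split chosen by the ASM
```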
While implementing a decision tree, the main issue that arises is how to select the best attribute for the root node and for the sub-nodes. To solve such problems there is a technique called the Attribute Selection Measure (ASM). Using this measure, we can easily select the best attribute for the nodes of the tree. There are two popular techniques for ASM:
o Information Gain
o Gini Index
1. Information Gain:
o Information gain is the measurement of changes in entropy after the
segmentation of a dataset based on an attribute.
o It calculates how much information a feature provides us about a
class.
o According to the value of information gain, we split the node and build
the decision tree.
o A decision tree algorithm always tries to maximize the value of information gain; the node/attribute having the highest information gain is split first. It can be calculated using the formula below:
Information Gain = Entropy(S) - [(Weighted Average) x Entropy(each feature)]
Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)
where S is the set of samples and P(yes), P(no) are the proportions of the "yes" and "no" classes.
2. Gini Index:
o Gini index is a measure of impurity or purity used while creating a
decision tree in the CART(Classification and Regression Tree)
algorithm.
o An attribute with a low Gini index should be preferred over an attribute with a high Gini index.
o It only creates binary splits, and the CART algorithm uses the Gini index to create binary splits.
o The Gini index can be calculated using the formula below:
Gini Index = 1 - Σj (Pj)²
where Pj is the proportion of samples belonging to class j.
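A short worked sketch of both measures (the tiny dataset below is invented for illustration; plain Python only):

```python
# Entropy, information gain, and Gini index on a toy "offer accepted?" dataset.
import math
from collections import Counter

def entropy(labels):
    # Entropy(S) = -sum_c P(c) * log2 P(c)
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gini(labels):
    # Gini(S) = 1 - sum_c P(c)^2
    total = len(labels)
    return 1 - sum((n / total) ** 2 for n in Counter(labels).values())

def information_gain(feature_values, labels):
    # IG = Entropy(S) - weighted average entropy of the subsets for each feature value
    total = len(labels)
    weighted = 0.0
    for value in set(feature_values):
        subset = [lbl for fv, lbl in zip(feature_values, labels) if fv == value]
        weighted += (len(subset) / total) * entropy(subset)
    return entropy(labels) - weighted

salary = ["high", "high", "low", "low", "high", "low"]   # attribute values
accepted = ["yes", "yes", "no", "no", "yes", "yes"]      # class labels

print("Entropy:", round(entropy(accepted), 3))                                      # about 0.918
print("Gini index:", round(gini(accepted), 3))                                      # about 0.444
print("Information gain (salary):", round(information_gain(salary, accepted), 3))   # about 0.459
```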
Advantages:
o It is simple to understand, because it follows the same process a human uses when making a decision.
o It is very useful for solving decision-related problems and for thinking about all possible outcomes of a problem.
o There is less requirement for data cleaning compared to other algorithms.
Disadvantages:
o The decision tree can contain many layers, which makes it complex.
o It may have an overfitting problem, which can be resolved by using the Random Forest algorithm.
o For more class labels, the computational complexity of the decision tree may increase.
Q) History of Machine Learning?
1950 – This is the year when Alan Turing, one of the most brilliant and influential British mathematicians and computer scientists, created the Turing test. The test was designed to determine whether a computer has human-like intelligence. In order to pass the test, the computer needs to be able to convince a human that it is another human. Apart from a computer program simulating a 13-year-old Ukrainian boy that is said to have passed the Turing test, there have been no other successful attempts so far.
1952 – Arthur Samuel, the American pioneer in the field of artificial intelligence and computer gaming, wrote the very first computer learning program. That program was actually the game of checkers. The IBM computer would first study which moves led to winning and then put them into its program.
1957 – This year witnessed the design of the very first neural network for computers, the perceptron, created by Frank Rosenblatt. It simulated the thought processes of the human brain. This is where today's neural networks originate from.
1967 – The nearest neighbor algorithm was written for the first time this year. It allowed computers to start using basic pattern recognition. The algorithm can be used to map a route for a traveling salesman that starts in a random city and ensures that the salesman passes through all the required cities in the shortest time. Today, the nearest neighbor algorithm, called KNN, is mostly used to classify a data point on the basis of how its neighbors are classified. KNN is used in retail applications that recognize patterns in credit card usage, or for theft prevention when implemented in CCTV image recognition in retail stores.
1985 – Terry Sejnowski invented the NetTalk program that could learn to pronounce
words just like a baby does during the process of language acquisition. The artificial
neural network aimed to reconstruct a simplified model that would show the
complexity of learning human-level cognitive tasks.
The 1990s – during the 1990s, the work in machine learning shifted from the
knowledge-driven approach to the data-driven approach. Scientists and researchers
created programs for computers that could analyze large amounts of data and draw
conclusions from the results. This led to the development of the IBM Deep Blue
computer, which won against the world’s chess champion Garry Kasparov in 1997.
2006 – this is the year when the term “deep learning” was coined by Geoffrey
Hinton. He used the term to explain a brand-new type of algorithms that allow
computers to see and distinguish objects or text in images or videos.
2010 – This year saw the introduction of Microsoft Kinect, which could track 20 human features at a rate of 30 times per second. Microsoft Kinect allowed users to interact with machines via gestures and movements.
2011 – this was an interesting year for machine learning. For starters, IBM’s Watson
managed to beat human competitors at Jeopardy. Moreover, Google developed
Google Brain equipped with a deep neural network that could learn to discover and
categorize objects (in particular, cats).
2015 – this is the year when Amazon launched its own machine learning platform,
making machine learning more accessible and bringing it to the forefront of software
development. Moreover, Microsoft created the Distributed Machine Learning Toolkit,
which enables developers to efficiently distribute machine learning problems across
multiple machines. During the same year, however, more than three thousand AI and robotics researchers, with the endorsement of figures such as Elon Musk, Stephen Hawking, and Steve Wozniak, signed an open letter warning about the dangers of autonomous weapons that could select targets without any human intervention.
2016 – this was the year when Google’s artificial intelligence algorithms managed to
beat a professional player at the Chinese board game Go. Go is considered the
world’s most complex board game. The AlphaGo algorithm developed by Google
won five out of five games in the competition, bringing AI to the front page.