ML UNIT II
Supervised Learning:
Supervised learning is the type of machine learning in which machines are trained using well
"labelled" training data, and on the basis of that data, machines predict the output. Labelled data
means that the input data is already tagged with the correct output.
In supervised learning, the training data provided to the machine works as a supervisor that
teaches the machine to predict the output correctly. It applies the same concept as a student learning
under the supervision of a teacher.
Supervised learning can be further divided into two types of problems:
1. Regression
2. Classification
Regression:
Regression algorithms are used when there is a relationship between the input variables and the output variable, and the value to be predicted is a continuous, real number such as a salary or a price.
Types of Regression Algorithms:
o Linear Regression
o Regression Trees
o Non-Linear Regression
o Bayesian Linear Regression
o Polynomial Regression
Classification:
Classification algorithms are used when the output variable is categorical, which means it takes one of a set of discrete classes such as Yes/No, Male/Female, or True/False.
Types of Classification Algorithms:
o Decision Trees
o Logistic Regression
o Support vector Machines
Regression Algorithms:
Linear regression is one of the easiest and most popular Machine Learning algorithms.
It is a statistical method that is used for predictive analysis.
Linear regression makes predictions for continuous/real or numeric variables such as sales, salary,
age, product price, etc.
The linear regression algorithm shows a linear relationship between a dependent variable (y) and one or more
independent variables (x), hence it is called linear regression.
Regression analysis is a form of predictive modeling technique which investigates the relationship
between a dependent and independent variable.
Linear Regression in Machine Learning
Linear Regression is a simple machine learning model for regression problems, i.e., when the target
variable is a real value.
Linear regression is a linear model, e.g. a model that assumes a linear relationship between the
input variables (x) and the single output variable (y). More specifically, that y can be calculated
from a linear combination of the input variables (x).
y = β0 + β1x + ε
where β0 is the intercept, β1 is the coefficient of the input variable, and ε is the random error term.
The values for x and y variables are training datasets for Linear Regression model representation.
Linear regression can be further divided into two types of algorithm:
o Simple Linear Regression: a single independent variable is used to predict the output.
o Multiple Linear Regression: more than one independent variable is used to predict the output.
When working with linear regression, our main goal is to find the best-fit line, which means the error
between the predicted values and the actual values should be minimized. The best-fit line has the
least error.
Different values for the weights or line coefficients (β0, β1) give different lines of regression, so we
need to calculate the best values for β0 and β1 to find the best-fit line; to do this we use a cost
function.
Cost function:
As noted above, different values of the coefficients (β0, β1) give different regression lines; the cost
function is used to estimate the coefficient values for the best-fit line.
The cost function optimizes the regression coefficients or weights and measures how well a linear
regression model is performing.
We can use the cost function to find the accuracy of the mapping function, which maps the input
variable to the output variable. This mapping function is also known as the hypothesis function.
For linear regression, we use the Mean Squared Error (MSE) cost function, which is the
average of the squared errors between the predicted values and the actual values. It can be written
as:
MSE = (1/n) Σ (yi – (β0 + β1xi))²
where n is the number of training examples, yi is the actual value of the i-th example, and (β0 + β1xi) is the predicted value.
Residuals: The distance between an actual value and its predicted value is called a residual. If the
observed points are far from the regression line, the residuals will be high and so the cost function
will be high. If the scatter points are close to the regression line, the residuals will be small and
hence the cost function will also be small.
Estimating Error
We can calculate an error for our predictions, called the Root Mean Squared Error or RMSE.
RMSE = sqrt( sum( (pi – yi)^2 )/n )
p is the predicted value and y is the actual value, i is the index for a specific instance, n is the
number of predictions, because we must calculate the error across all predicted values.
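As a quick illustration, the RMSE calculation above can be written as a few lines of Python (a minimal sketch; the function name rmse is my own choice, not from the notes):

import math

def rmse(predicted, actual):
    # Root Mean Squared Error: square root of the average squared
    # difference between predicted and actual values.
    n = len(predicted)
    total = sum((p - y) ** 2 for p, y in zip(predicted, actual))
    return math.sqrt(total / n)

print(rmse([1.2, 2.0], [1.0, 3.0]))  # sqrt((0.04 + 1.0) / 2) ~ 0.721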
Example:
y = B0 + B1 * x
Data:
x y
1 1
2 3
4 3
3 2
5 5
Calculating mean values of x:
x mean(x) x - mean(x)
1 3 -2
2 3 -1
4 3 1
3 3 0
5 3 2
Calculating mean values of y:
y mean(y) y - mean(y)
1 2.8 -1.8
3 2.8 0.2
3 2.8 0.2
2 2.8 -0.8
5 2.8 2.2
Calculating the products and squares needed for B1:
x - mean(x)   y - mean(y)   (x - mean(x)) * (y - mean(y))   (x - mean(x))²
-2            -1.8           3.6                              4
-1             0.2          -0.2                              1
 1             0.2           0.2                              1
 0            -0.8           0.0                              0
 2             2.2           4.4                              4
Sums:                         8                               10
B1 = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))²) = 8 / 10 = 0.8
B0 = mean(y) - B1 * mean(x) = 2.8 - 0.8 * 3 = 0.4
y = 0.4 + 0.8 * x
x y predicted y
1 1 1.2
2 3 2
4 3 3.6
3 2 2.8
5 5 4.4
RMSE = sqrt( sum( (pi – yi)^2 )/n )
predicted y   actual y   error (difference)   squared error
1.2           1           0.2                  0.04
2             3          -1                    1
3.6           3           0.6                  0.36
2.8           2           0.8                  0.64
4.4           5          -0.6                  0.36
Sum of squared errors = 2.4
RMSE = SQRT(2.4 / 5) = 0.69282
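The whole worked example can be reproduced with a short Python sketch (variable names b0 and b1 are illustrative; the formulas are exactly the ones used in the hand calculation above):

import math

x = [1, 2, 4, 3, 5]
y = [1, 3, 3, 2, 5]

mean_x = sum(x) / len(x)   # 3.0
mean_y = sum(y) / len(y)   # 2.8

# B1 = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
     sum((xi - mean_x) ** 2 for xi in x)        # 8 / 10 = 0.8
b0 = mean_y - b1 * mean_x                       # 0.4

predictions = [b0 + b1 * xi for xi in x]        # [1.2, 2.0, 3.6, 2.8, 4.4]
rmse = math.sqrt(sum((p - yi) ** 2 for p, yi in zip(predictions, y)) / len(y))
print(b0, b1, rmse)                             # 0.4 0.8 0.6928...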
Classification:
Classification is a process of finding a function which helps in dividing the dataset into classes
based on different parameters.
In Classification, a computer program is trained on the training dataset and based on that training,
it categorizes the data into different classes.
The task of the classification algorithm is to find the mapping function to map the input(x) to the discrete
output(y).
The Classification algorithm is a Supervised Learning technique that is used to identify the category of new
observations on the basis of training data.
Classification algorithms can be further divided into two main categories:
o Linear Models
   o Logistic Regression
   o Support Vector Machines
o Non-linear Models
   o K-Nearest Neighbours
   o Kernel SVM
   o Naïve Bayes
   o Decision Tree Classification
   o Random Forest Classification
Naïve Bayes
The Naïve Bayes classifier is based on Bayes' theorem, with the "naïve" assumption that the predictors (attributes) are independent of each other:
P(c|x) = ( P(x|c) * P(c) ) / P(x)
P(c|x) is the posterior probability of class (c, target) given predictor (x, attributes).
P(c) is the prior probability of the class.
P(x|c) is the likelihood, which is the probability of the predictor given the class.
P(x) is the prior probability of the predictor.
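As a hedged sketch of how this looks in practice, scikit-learn's GaussianNB implements Naive Bayes for numeric predictors (assuming a normal distribution per class, as noted under the pros below); the tiny toy dataset here is made up for illustration only:

from sklearn.naive_bayes import GaussianNB

# Toy data: two numeric features per example, binary class labels.
X = [[2.0, 1.0], [1.5, 1.8], [1.0, 0.6], [5.0, 8.0], [6.0, 9.0], [7.0, 8.5]]
y = [0, 0, 0, 1, 1, 1]

model = GaussianNB()
model.fit(X, y)

# Posterior probabilities P(c|x) for a new point, and the predicted class.
print(model.predict_proba([[1.2, 1.0]]))  # probabilities for class 0 and class 1
print(model.predict([[1.2, 1.0]]))        # [0]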
Pros:
It is easy and fast to predict the class of a test data set. It also performs well in multi-class
prediction.
When the assumption of independence holds, a Naive Bayes classifier performs better
compared to other models like logistic regression, and you need less training data.
It performs well with categorical input variables compared to numerical variables.
For numerical variables, a normal distribution is assumed (a bell curve, which is a strong assumption).
Cons:
If a categorical variable has a category (in the test data set) which was not observed in the training
data set, then the model will assign it a 0 (zero) probability and will be unable to make a prediction.
This is often known as the "Zero Frequency" problem. To solve this, we can use a smoothing technique; one
of the simplest smoothing techniques is called Laplace estimation.
On the other hand, naive Bayes is also known to be a bad estimator, so the probability outputs
from predict_proba are not to be taken too seriously.
Another limitation of Naive Bayes is the assumption of independent predictors. In real life,
it is almost impossible that we get a set of predictors which are completely independent.
Applications of Naive Bayes:
Real-time Prediction: Naive Bayes is an eager learning classifier and it is very fast. Thus, it
can be used for making predictions in real time.
Multi-class Prediction: This algorithm is also well known for its multi-class prediction feature.
Here we can predict the probabilities of multiple classes of the target variable.
Text Classification / Spam Filtering / Sentiment Analysis: Naive Bayes classifiers, mostly
used in text classification (due to better results in multi-class problems and the independence assumption), have a
higher success rate compared to other algorithms. As a result, they are widely used in spam filtering
(identifying spam e-mail) and sentiment analysis (in social media analysis, to identify positive and
negative customer sentiments).
Recommendation System: A Naive Bayes classifier and collaborative filtering together
build a recommendation system that uses machine learning and data mining techniques to filter
unseen information and predict whether a user would like a given resource or not.
Some machine learning models belong to either the “generative” or “discriminative” model
categories. Yet what is the difference between these two categories of models? What does it mean
for a model to be discriminative or generative?
The short answer is that generative models are those that include the distribution of the data set,
returning a probability for a given example. Generative models are often used to predict what
occurs next in a sequence.
Meanwhile, discriminative models are used for either classification or regression and they return a
prediction based on conditional probability.
Generative Models
Generative models are those that center on the distribution of the classes within the dataset.
The machine learning algorithms typically model the distribution of the data points.
Generative models rely on finding the joint probability: the probability that a given input feature
and a desired output/label occur together.
Generative models are typically employed to estimate probabilities and likelihood, modeling data
points and discriminating between classes based on these probabilities. Because the model learns a
probability distribution for the dataset, it can reference this probability distribution to generate
new data instances.
Generative models often rely on Bayes theorem to find the joint probability, finding p(x,y).
Essentially, generative models model how the data was generated, answering the following question:
“What’s the likelihood that this class or another class generated this data point/instance?”
Examples of generative machine learning models include Linear Discriminant Analysis (LDA),
Hidden Markov models, and Bayesian networks like Naive Bayes.
Discriminative Models
While generative models learn about the distribution of the dataset, discriminative models learn
about the boundary between classes within a dataset.
With discriminative models, the goal is to identify the decision boundary between classes to apply
reliable class labels to data instances. Discriminative models separate the classes in the dataset by
using conditional probability, not making any assumptions about individual data points.
Examples of discriminative models in machine learning include support vector machines, logistic
regression, decision trees, and random forests.
Generative models:
Generative models aim to capture the actual distribution of the classes in the dataset.
Generative models predict the joint probability distribution – p(x,y) – utilizing Bayes
Theorem.
Generative models are computationally expensive compared to discriminative models.
Generative models are useful for unsupervised machine learning tasks.
Generative models are impacted by the presence of outliers more than discriminative
models.
Discriminative models:
Discriminative models model the decision boundary for the dataset classes.
Discriminative models learn the conditional probability – p(y|x).
Discriminative models are computationally cheap compared to generative models.
Discriminative models are useful for supervised machine learning tasks.
Discriminative models have the advantage of being more robust to outliers than generative models.
Logistic Regression:
o Logistic regression is another supervised learning algorithm which is used to solve
classification problems. In classification problems, we have dependent variables in a binary or
discrete format, such as 0 or 1.
o The logistic regression algorithm works with categorical variables such as 0 or 1, Yes or
No, True or False, Spam or Not Spam, etc.
o It is a predictive analysis algorithm which works on the concept of probability.
o Logistic regression is a type of regression, but it differs from the linear regression
algorithm in terms of how it is used.
o Logistic regression uses the sigmoid function (also called the logistic function) to model
the data. The function can be represented as:
f(x) = 1 / (1 + e^-x)
When we provide the input values (data) to the function, it gives the S-curve as follows:
o It uses the concept of threshold levels: values above the threshold level are rounded up
to 1, and values below the threshold level are rounded down to 0.
Example:
This dataset has two input variables (X1 and X2) and one output variable (Y). The input variables
are real-valued random numbers drawn from a Gaussian distribution. The output variable has two
values, making the problem a binary classification problem.
X1 X2 Y
2.7810836 2.550537003 0
1.465489372 2.362125076 0
3.396561688 4.400293529 0
1.38807019 1.850220317 0
3.06407232 3.005305973 0
7.627531214 2.759262235 1
5.332441248 2.088626775 1
6.922596716 1.77106367 1
8.675418651 -0.2420686549 1
7.673756466 3.508563011 1
Logistic Function
Before we dive into logistic regression, let’s take a look at the logistic function, the heart of the
logistic regression technique.
The logistic function is defined as:
transformed = 1 / (1 + e^-x)
Where e is the numerical constant Euler's number and x is an input we plug into the function.
Let’s plug in a series of numbers from -5 to +5 and see how the logistic function transforms them:
X Transformed
-5 0.006692850924
-4 0.01798620996
-3 0.04742587318
-2 0.119202922
-1 0.2689414214
0 0.5
1 0.7310585786
2 0.880797078
3 0.9525741268
4 0.98201379
5 0.9933071491
You can see that all of the inputs have been transformed into the range [0, 1] and that the smallest
negative numbers resulted in values close to zero and the larger positive numbers resulted in
values close to one. You can also see that 0 transformed to 0.5 or the midpoint of the new range.
From this we can see that as long as our mean value is zero, we can plug in positive and negative
values into the function and always get out a consistent transform into the new range.
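The table above can be reproduced with a few lines of Python (a minimal sketch using math.exp for e):

import math

def logistic(x):
    # Logistic (sigmoid) function: maps any real number into the range (0, 1).
    return 1 / (1 + math.exp(-x))

for value in range(-5, 6):
    print(value, logistic(value))
# -5 -> 0.00669..., 0 -> 0.5, 5 -> 0.99330..., matching the table above.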
Applying the model to the training data gives, for each instance, a probability of belonging to class=1. We can convert these probabilities into crisp class values using:
prediction = IF (output < 0.5) Then 0 Else 1
With this simple procedure we can convert all of the outputs to class values:
0, 0, 0, 0, 0, 1, 1, 1, 1, 1
Finally, we can calculate the accuracy for the model on the training dataset:
accuracy = (correct predictions / num predictions made) * 100
accuracy = (10 /10) * 100
accuracy = 100%
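The notes above do not show how the model coefficients were learned; as a hedged sketch, scikit-learn's LogisticRegression can be fitted on the same ten points and is expected to reproduce the 100% training accuracy, since the data is linearly separable:

from sklearn.linear_model import LogisticRegression

X = [[2.7810836, 2.550537003], [1.465489372, 2.362125076],
     [3.396561688, 4.400293529], [1.38807019, 1.850220317],
     [3.06407232, 3.005305973], [7.627531214, 2.759262235],
     [5.332441248, 2.088626775], [6.922596716, 1.77106367],
     [8.675418651, -0.242068655], [7.673756466, 3.508563011]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

model = LogisticRegression()
model.fit(X, y)

predictions = model.predict(X)
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y) * 100
print(list(predictions))  # expected: [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(accuracy)           # expected: 100.0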
Decision Tree
Introduction: Decision Trees are a type of Supervised Machine Learning (that is, you explain
what the input is and what the corresponding output is in the training data) where the data is
continuously split according to a certain parameter. The tree can be explained by two entities,
namely decision nodes and leaves. The leaves are the decisions or final outcomes, and the
decision nodes are where the data is split.
An example of a decision tree can be explained using a binary tree. Let's say you want to
predict whether a person is fit given information like their age, eating habits, and physical
activity. The decision nodes here are questions like 'What is the age?', 'Does the person exercise?', and
'Does the person eat a lot of pizza?', and the leaves are outcomes like 'fit' or 'unfit'. In this case it is
a binary classification problem (a yes/no type problem).
There are two main types of Decision Trees:
1. Classification Trees: what we have seen above is an example of a classification tree, where the
outcome was a categorical variable like 'fit' or 'unfit'.
2. Regression Trees: here the decision or outcome variable is continuous, e.g. a number like 123.
Working: Now that we know what a Decision Tree is, we will see how it works internally. There are
many algorithms that construct Decision Trees, but one of the best known is the ID3 Algorithm.
ID3 stands for Iterative Dichotomiser 3. Before discussing the ID3 algorithm, we will go through a
few definitions.
Entropy: Entropy, also called Shannon Entropy, denoted by H(S) for a finite set S, is the measure of
the amount of uncertainty or randomness in the data. It is defined as H(S) = – Σ p(c) log2 p(c),
where the sum runs over the classes c and p(c) is the proportion of examples in S belonging to class c.
A classification tree is an algorithm where the target variable is fixed or categorical. The
algorithm is then used to identify the “class” within which a target variable would most likely fall.
An example of a classification-type problem would be determining who will or will not subscribe
to a digital platform; or who will or will not graduate from high school.
These are examples of simple binary classifications where the categorical dependent variable can
assume only one of two, mutually exclusive values. In other cases, you might have to predict
among a number of different categories. For instance, you may have to predict which type of
smartphone a consumer may decide to purchase.
In such cases, there are multiple values for the categorical dependent variable. Here’s what a
classic classification tree looks like
Classification trees are used when the dataset needs to be split into classes belonging to the
response variable. In many cases, the classes are simply Yes or No: there are just two classes, and
they are mutually exclusive. In some cases, there may be more than two classes, in which case a
variant of the classification tree algorithm is used.
Regression Trees
A regression tree refers to an algorithm where the target variable is continuous and the algorithm is
used to predict its value. As an example of a regression-type problem, you may want to predict the
selling price of a residential house, which is a continuous dependent variable.
This will depend on both continuous factors like square footage as well as categorical factors like
the style of home, area in which the property is located and so on.
Regression trees are used when the response variable is continuous. For instance, if the response
variable is something like the price of a property or the temperature of the day, a regression tree is
used.
In other words, regression trees are used for prediction-type problems while classification trees
are used for classification-type problems.
A Classification and Regression Tree (CART) is a predictive algorithm used in machine learning.
It explains how a target variable’s values can be predicted based on other values.
It is a decision tree where each fork is a split in a predictor variable and each node at the end
has a prediction for the target variable.
The CART algorithm is an important decision tree algorithm that lies at the foundation of
machine learning. Moreover, it is also the basis for other powerful machine learning algorithms
like bagged decision trees, random forest and boosted decision trees.
Summing up
The Classification and regression tree (CART) methodology is one of the oldest and most
fundamental algorithms. It is used to predict outcomes based on certain predictor variables.
They are excellent for data mining tasks because they require very little data pre-processing.
Decision tree models are easy to understand and implement which gives them a strong advantage
when compared to other analytical models.
Gini index
The Gini index is a metric for classification tasks in CART. It is computed from the sum of the
squared probabilities of each class. We can formulate it as illustrated below.
Gini = 1 – Σ (Pi)², where the sum runs from i = 1 to the number of classes
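A minimal Python sketch of this formula (the helper name gini_index is my own):

def gini_index(class_counts):
    # class_counts: number of instances of each class in one group,
    # e.g. [6, 2] for 6 "Yes" and 2 "No" instances.
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1 - sum((count / total) ** 2 for count in class_counts)

print(gini_index([6, 2]))  # 0.375, as in the Wind = Weak calculation below
print(gini_index([3, 3]))  # 0.5,   as in the Wind = Strong calculation below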
Outlook
Outlook is a nominal feature. It can be sunny, overcast or rain. The Gini index is calculated for each
candidate feature in the same way; the calculation for the Wind feature is shown below as an example.
Wind     Yes   No   Number of instances
Weak     6     2    8
Strong   3     3    6
Gini(Wind=Weak) = 1 – (6/8)² – (2/8)² = 1 – 0.5625 – 0.0625 = 0.375
Gini(Wind=Strong) = 1 – (3/6)² – (3/6)² = 1 – 0.25 – 0.25 = 0.5
Gini(Wind) = (8/14) x 0.375 + (6/14) x 0.5 = 0.428
Time to decide
We've calculated the Gini index values for each feature. The winner is the outlook feature because
its Gini index is the lowest.
Feature Gini index
Outlook 0.342
Temperature 0.439
Humidity 0.367
Wind 0.428
We’ll put outlook decision at the top of the tree.
You might notice that the sub dataset in the overcast leaf has only yes decisions. This means that the
overcast branch is complete (it is a pure leaf).
We will apply the same principles to the other sub datasets in the following steps.
Focus on the sub dataset for sunny outlook. We need to find the gini index scores for temperature,
humidity and wind features respectively.
Gini of temperature for sunny outlook
Temperature   Yes   No   Number of instances
Hot           0     2    2
Cool          1     0    1
Mild          1     1    2
Gini(Outlook=Sunny and Temp.=Hot) = 1 – (0/2)² – (2/2)² = 0
Gini(Outlook=Sunny and Temp.=Cool) = 1 – (1/1)² – (0/1)² = 0
Gini(Outlook=Sunny and Temp.=Mild) = 1 – (1/2)² – (1/2)² = 1 – 0.25 – 0.25 = 0.5
Gini(Outlook=Sunny and Temp.) = (2/5) x 0 + (1/5) x 0 + (2/5) x 0.5 = 0.2
Gini of humidity for sunny outlook
Humidity   Yes   No   Number of instances
High       0     3    3
Normal     2     0    2
Gini(Outlook=Sunny and Humidity=High) = 1 – (0/3)² – (3/3)² = 0
Gini(Outlook=Sunny and Humidity=Normal) = 1 – (2/2)² – (0/2)² = 0
Gini(Outlook=Sunny and Humidity) = (3/5) x 0 + (2/5) x 0 = 0
Gini of wind for sunny outlook
Wind Yes No Number of instances
Weak 1 2 3
Strong 1 1 2
Gini(Outlook=Sunny and Wind=Weak) = 1 – (1/3)² – (2/3)² = 0.444
Gini(Outlook=Sunny and Wind=Strong) = 1 – (1/2)² – (1/2)² = 0.5
Gini(Outlook=Sunny and Wind) = (3/5) x 0.444 + (2/5) x 0.5 = 0.466
Decision for sunny outlook
We've calculated the Gini index scores for each feature when the outlook is sunny. The winner is humidity
because it has the lowest value.
Feature Gini index
Temperature 0.2
Humidity 0
Wind 0.466
We’ll put humidity check at the extension of sunny outlook.
As seen, the decision is always no for high humidity and sunny outlook, and it is always yes for
normal humidity and sunny outlook. This branch is complete.
Now, we need to focus on rain outlook.
Rain outlook
Gini of temperature for rain outlook
Temperature   Yes   No   Number of instances
Cool          1     1    2
Mild          2     1    3
Gini(Outlook=Rain and Temp.=Cool) = 1 – (1/2)² – (1/2)² = 0.5
Gini(Outlook=Rain and Temp.=Mild) = 1 – (2/3)² – (1/3)² = 0.444
Gini(Outlook=Rain and Temp.) = (2/5) x 0.5 + (3/5) x 0.444 = 0.466
Gini of humidity for rain outlook
Humidity   Yes   No   Number of instances
High       1     1    2
Normal     2     1    3
Gini(Outlook=Rain and Humidity=High) = 1 – (1/2)² – (1/2)² = 0.5
Gini(Outlook=Rain and Humidity=Normal) = 1 – (2/3)² – (1/3)² = 0.444
Gini(Outlook=Rain and Humidity) = (2/5) x 0.5 + (3/5) x 0.444 = 0.466
Gini of wind for rain outlook
Wind     Yes   No   Number of instances
Weak     3     0    3
Strong   0     2    2
Gini(Outlook=Rain and Wind=Weak) = 1 – (3/3)² – (0/3)² = 0
Gini(Outlook=Rain and Wind=Strong) = 1 – (0/2)² – (2/2)² = 0
Gini(Outlook=Rain and Wind) = (3/5) x 0 + (2/5) x 0 = 0
Decision for rain outlook
The winner is the wind feature for the rain outlook because it has the minimum Gini index score
among the features.
Feature Gini index
Temperature 0.466
Humidity 0.466
Wind 0
Put the wind feature for rain outlook branch and monitor the new sub data sets.
As seen, the decision is always yes when the wind is weak, and always no when the wind is strong.
This means that this branch is complete.
Artificial Neural Network
The term "Artificial Neural Network" is derived from Biological neural networks that develop
the structure of a human brain. Similar to the human brain that has neurons interconnected to one
another, artificial neural networks also have neurons that are interconnected to one another in
various layers of the networks. These neurons are known as nodes.
The typical Artificial Neural Network looks something like the given figure.
A neuron takes inputs, does some math with them, and produces one output. Here’s what a 2-
input neuron looks like:
The activation function is used to turn an unbounded input into an output that has a nice,
predictable form. A commonly used activation function is the sigmoid function.
Input Layer:
As the name suggests, it accepts inputs in several different formats provided by the programmer.
Hidden Layer:
The hidden layer presents in-between input and output layers. It performs all the calculations to
find hidden features and patterns.
Output Layer:
The input goes through a series of transformations using the hidden layer, which finally results in
output that is conveyed using this layer.
The artificial neural network takes input and computes the weighted sum of the inputs and
includes a bias. This computation is represented in the form of a transfer function.
The weighted total is then passed as an input to an activation function to produce the output.
Activation functions decide whether a node should fire or not; only the nodes that fire make it to
the output layer. There are various activation functions available, which can be applied depending on
the sort of task we are performing.
Assume we have a 2-input neuron that uses the sigmoid activation function and has the following
parameters:
w = [0, 1], b = 4
w = [0, 1] is just a way of writing w1 = 0, w2 = 1 in vector form. Now, let's give the neuron an input
of x = [2, 3]. We'll use the dot product to write things more concisely:
(w · x) + b = (w1*x1 + w2*x2) + b = (0*2 + 1*3) + 4 = 7
y = f((w · x) + b) = f(7) = 1 / (1 + e^-7) ≈ 0.999
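A minimal Python sketch of this forward pass (numpy is used for the dot product; the class name Neuron is illustrative):

import numpy as np

def sigmoid(x):
    # Activation function: squashes the weighted sum into the range (0, 1).
    return 1 / (1 + np.exp(-x))

class Neuron:
    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def feedforward(self, inputs):
        # Weighted sum of the inputs plus the bias, passed through the activation.
        total = np.dot(self.weights, inputs) + self.bias
        return sigmoid(total)

neuron = Neuron(np.array([0, 1]), 4)         # w = [0, 1], b = 4
print(neuron.feedforward(np.array([2, 3])))  # sigmoid(7) ~ 0.999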
Feedback ANN:
In this type of ANN, the output returns into the network to achieve the best-evolved results
internally. As per the University of Massachusetts Lowell Centre for Atmospheric Research,
feedback networks feed information back into themselves and are well suited to solving optimization
problems. Internal system error corrections utilize feedback ANNs.
Feed-Forward ANN:
A feed-forward network is a basic neural network comprising an input layer, an output layer,
and at least one layer of neurons. By assessing its output against its input, the strength of the
network can be observed based on the group behavior of the associated neurons, and the
output is decided. The primary advantage of this network is that it learns to evaluate and
recognize input patterns.
Support Vector Machine (SVM):
Support Vector Machine or SVM is one of the most popular Supervised Learning algorithms, and it
is used for Classification as well as Regression problems. However, it is primarily used for
Classification problems in Machine Learning.
The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-
dimensional space into classes so that we can easily put the new data point in the correct category
in the future. This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases
are called support vectors, and hence the algorithm is termed Support Vector Machine.
Consider the below diagram in which there are two different categories that are classified
using a decision boundary or hyperplane:
Example: SVM can be understood with the example that we used for the KNN classifier.
Suppose we see a strange cat that also has some features of dogs; if we want a model that can
accurately identify whether it is a cat or a dog, such a model can be created using the SVM
algorithm. We first train the model with lots of images of cats and dogs so that it can learn
their different features, and then we test it with this strange creature. The support vector machine
creates a decision boundary between the two classes (cat and dog) and chooses the extreme cases
(support vectors); on the basis of the support vectors, it will classify the new example as a cat.
Consider the below diagram:
SVM algorithm can be used for Face detection, image classification, text categorization, etc.
Types of SVM
o Linear SVM: Linear SVM is used for linearly separable data, which means if a dataset can be
classified into two classes by using a single straight line, then such data is termed linearly separable
data, and the classifier used is called a Linear SVM classifier.
o Non-linear SVM: Non-Linear SVM is used for non-linearly separable data, which means if a
dataset cannot be classified by using a straight line, then such data is termed non-linear data, and
the classifier used is called a Non-linear SVM classifier.
Support Vectors:
The data points or vectors that are closest to the hyperplane and which affect the position of
the hyperplane are termed support vectors. Since these vectors support the hyperplane, they are
called support vectors.
How does SVM work?
Linear SVM:
The working of the SVM algorithm can be understood by using an example. Suppose we have a
dataset that has two tags (green and blue), and the dataset has two features, x1 and x2. We want a
classifier that can classify the pair (x1, x2) of coordinates as either green or blue. Consider the
below image:
As it is a 2-D space, we can easily separate these two classes just by using a straight line. But
there can be multiple lines that can separate these classes. Consider the below image:
Hence, the SVM algorithm helps to find the best line or decision boundary; this best boundary or
region is called a hyperplane. The SVM algorithm finds the closest points of the lines from both
classes. These points are called support vectors. The distance between the vectors and the
hyperplane is called the margin, and the goal of SVM is to maximize this margin. The
hyperplane with the maximum margin is called the optimal hyperplane.
Non-Linear SVM:
If data is linearly arranged, then we can separate it by using a straight line, but for non-linear
data, we cannot draw a single straight line. Consider the below image:
So to separate these data points, we need to add one more dimension. For linear data,
we have used the two dimensions x and y, so for non-linear data, we will add a third
dimension z. It can be calculated as: z = x² + y²
By adding the third dimension, the sample space will become as below image:
So now, SVM will divide the datasets into classes in the following way. Consider the below
image:
Since we are in 3-D space, the separating boundary looks like a plane parallel to the x-axis. If we
convert it back to 2-D space with z = 1, it becomes a circle:
hence we get a circumference of radius 1 in the case of non-linear data.
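A small Python sketch of this idea (the circular toy data below is made up; adding the z = x² + y² feature by hand mimics what a kernel SVM does implicitly):

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy data: class 0 lies inside the unit circle, class 1 outside it.
X = rng.uniform(-2, 2, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)

# Add the third dimension z = x^2 + y^2; the classes become linearly separable.
z = (X[:, 0] ** 2 + X[:, 1] ** 2).reshape(-1, 1)
X3 = np.hstack([X, z])

linear_svm = SVC(kernel="linear").fit(X3, y)
print(linear_svm.score(X3, y))  # close to 1.0 on this toy data

# A kernel SVM learns the same separation without manual feature engineering.
rbf_svm = SVC(kernel="rbf").fit(X, y)
print(rbf_svm.score(X, y))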
Ensemble methods:
Ensemble learning is a machine learning paradigm where multiple models (often called “weak
learners”) are trained to solve the same problem and combined to get better results. The main
hypothesis is that when weak models are correctly combined, we can obtain more accurate and/or
robust models.
Ensemble methods take multiple small models and combine their predictions to obtain more
powerful predictive performance.
i. Boosting
ii. Bagging
1. Bagging:
Bagging is a type of ensemble technique in which a single training algorithm is used on different
subsets of the training data, where the subset sampling is done with replacement (bootstrap). Once
the algorithm is trained on all the subsets, bagging predicts by aggregating all the predictions
made by the algorithm on the different subsets.
For aggregating the outputs of base learners, bagging uses majority voting (most frequent
prediction among all predictions) for classification and averaging (mean of all the predictions) for
regression.
Advantages of Bagging:
1. Bagging methods work so well because of the diversity in the training data, since the sampling is
done by bootstrapping.
2. Also, if the training set is very huge, it can save computational time by training the model on a
relatively smaller data set and still increase the accuracy of the model.
Disadvantages of Bagging:
1. The main disadvantage of bagging is that it improves the accuracy of the model at the expense
of interpretability, i.e., if a single tree were used as the base model, it would have a more
attractive and easily interpretable diagram, but with the use of bagging this interpretability is lost.
2. There is less control over which features are being selected, i.e., there are chances that some
features are never used, so some information in the data may be ignored.
3. Each base model is trained not on all the data but on a fraction of the original dataset. There
might be some data that are never sampled at all; the remaining data which are not sampled are
called out-of-bag instances.
The Random Forest approach is a bagging method where deep trees (Decision Trees), fitted on
bootstrap samples, are combined to produce an output with lower variance.
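A hedged sketch of bagging and Random Forest with scikit-learn (the dataset from make_classification is synthetic and only for illustration):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Bagging: many base learners (decision trees by default), each fitted on a
# bootstrap sample, combined by majority voting.
bagging = BaggingClassifier(n_estimators=50, random_state=42)
bagging.fit(X_train, y_train)
print("Bagging accuracy:", bagging.score(X_test, y_test))

# Random Forest: bagging of deep decision trees with extra feature randomness.
forest = RandomForestClassifier(n_estimators=50, random_state=42)
forest.fit(X_train, y_train)
print("Random Forest accuracy:", forest.score(X_test, y_test))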
2. Boosting:
· Boosting models fall inside this family of ensemble methods.
· Boosting, initially named Hypothesis Boosting, consists of the idea of filtering or weighting the
data that is used to train our team of weak learners, so that each new learner gives more weight to, or is
only trained with, observations that have been poorly classified by the previous learners.
· By doing this, the team of models learns to make accurate predictions on all kinds of data, not just
the most common or easy observations. Also, if one of the individual models is very bad at
making predictions on some kind of observation, it does not matter, as the other N-1 models will
most likely make up for it.
Definition: The term 'Boosting' refers to a family of algorithms which converts weak learners to
strong learners. Boosting is an ensemble method for improving the model predictions of any given
learning algorithm. The idea of boosting is to train weak learners sequentially, each trying to
correct its predecessor. The weak learners are sequentially corrected by their predecessors and, in
the process, they are converted into strong learners.
Advantages of Boosting:
1. Provably effective.
Disadvantages of Boosting:
1. A disadvantage of boosting is that it is sensitive to outliers, since every classifier is obliged to fix
the errors of its predecessors. Thus, the method is too dependent on outliers.
2. Another disadvantage is that the method is almost impossible to scale up. This is because every
estimator bases its correctness on the previous predictors, thus making the procedure difficult to
streamline.
AdaBoost (Adaptive Boosting), Gradient Boosting, and XGBoost (Extreme Gradient Boosting) are a few
common examples of boosting techniques.
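A hedged sketch of AdaBoost with scikit-learn (synthetic data again, only to show the API):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# AdaBoost fits weak learners (shallow trees by default) sequentially,
# re-weighting the training examples so that later learners focus on the
# observations that earlier learners misclassified.
boosting = AdaBoostClassifier(n_estimators=50, random_state=42)
boosting.fit(X_train, y_train)
print("AdaBoost accuracy:", boosting.score(X_test, y_test))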