Data Science and Applications Notes
Predictive modeling is also extremely useful in Big Data scenarios, where the
data is large, unstructured, and complex, and cannot be managed by a
conventional database management system. For example, social networks and
web logs are sources of Big Data which, if studied and analyzed carefully,
can provide significant insights into user behavioral patterns.
What are cross-validation, the bias-variance trade-off, overfitting, multicollinearity,
and the cost function of linear regression models?
Multicollinearity -
• The inherent correlation among the predictors (X variables) leads to the problem
of multicollinearity.
• Not only direct correlation between pairs of X variables but also a multi-linear
relation among several of them leads to the problem.
• Multicollinearity leads to instability / high variance / overfitting of the coefficients of
the model: even a small change in the training data can lead to a dramatic
change in the coefficients and predictions (see the sketch after this list).
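A minimal sketch of this instability, using NumPy and synthetic data (illustrative only, not part of the original notes): two nearly identical predictors are fit by ordinary least squares on two slightly different subsamples, and the individual coefficients swing widely even though the overall fit barely changes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Two highly correlated predictors: x2 is x1 plus a tiny amount of noise.
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
y = 3 * x1 + 2 * x2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x1, x2])

# Fit ordinary least squares on two slightly different subsamples
# and compare the estimated coefficients.
for seed in (1, 2):
    idx = np.random.default_rng(seed).choice(n, size=80, replace=False)
    coef, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    print("subsample", seed, "coefficients:", np.round(coef, 2))

# The individual coefficients on x1 and x2 swing between subsamples,
# even though their sum (about 5) stays roughly stable.
```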
Overfitting -
• A model is said to be overfitting if it performs well on the training data but performs
significantly worse on the test data.
• Reasons for overfitting (illustrated in the sketch after this list):
An overly complex model (with a large number of variables and confounding)
Over-tuning of hyperparameters in order to get better training accuracy
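A small illustrative sketch (NumPy only, synthetic data, assumed setup): fitting a high-degree polynomial to a handful of noisy points drives the training error close to zero while the error on held-out points grows, which is the signature of overfitting.

```python
import numpy as np

rng = np.random.default_rng(42)

# Small noisy sample from a simple underlying function.
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=10)
x_test = np.linspace(0.05, 0.95, 10)
y_test = np.sin(2 * np.pi * x_test) + rng.normal(scale=0.2, size=10)

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

# Compare a modest model with an overly complex one.
for degree in (3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = mse(y_train, np.polyval(coeffs, x_train))
    test_err = mse(y_test, np.polyval(coeffs, x_test))
    print(f"degree {degree}: train MSE {train_err:.4f}, test MSE {test_err:.4f}")

# The degree-9 fit has near-zero training error but a noticeably
# larger test error -- it has memorized the noise in the training set.
```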
Cross Validation -
The purpose of cross-validation is to test the ability of a machine learning model
to predict new data. It is also used to flag problems like overfitting or selection bias,
and it gives insight into how the model will generalize to an independent dataset
(a minimal k-fold sketch follows).
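A minimal k-fold cross-validation sketch, assuming scikit-learn is available and using synthetic data (illustrative only): the data is split into 5 folds, and each fold takes a turn as the held-out test set while the model is trained on the remaining 4 folds.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic regression data (illustrative only).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# 5-fold cross-validation: each fold is used once as the validation set.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")

print("R^2 per fold:", np.round(scores, 3))
print("mean R^2:", scores.mean().round(3))
```

If the per-fold scores are consistently high and close to each other, the model is likely to generalize; a large gap between folds, or between training and validation scores, flags overfitting or selection bias.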
Gradient descent algorithm and its working mechanism
• Gradient descent is one of the most widely used optimization algorithms in
machine learning in the industry, and yet it confounds a lot of newcomers.
• Gradient descent is an iterative optimization algorithm for finding a
local minimum of a function.
• The goal of the gradient descent algorithm is to minimize the given
function (say, a cost function). To achieve this goal, it performs two steps
iteratively (see the sketch after this list):
1. Compute the gradient (slope), the first-order derivative of the function
at the current point.
2. Take a step (move) in the direction opposite to the gradient, i.e. move
from the current point in the direction of decreasing slope, by alpha (the
learning rate) times the gradient at that point.
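A minimal sketch of these two steps applied to the mean squared error cost function of linear regression, using NumPy and synthetic data (the variable names and learning rate `alpha` are illustrative assumptions, not from the original notes).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 4 + 3x + noise.
n = 100
x = rng.uniform(0, 2, size=n)
y = 4 + 3 * x + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x])   # add intercept column
theta = np.zeros(2)                    # initial parameters
alpha = 0.1                            # learning rate

def cost(theta):
    # Mean squared error cost function of linear regression.
    residuals = X @ theta - y
    return (residuals ** 2).mean() / 2

for step in range(1000):
    gradient = X.T @ (X @ theta - y) / n   # step 1: compute the gradient
    theta = theta - alpha * gradient       # step 2: move against the gradient
    if step % 200 == 0:
        print(f"step {step}: cost = {cost(theta):.4f}")

print("estimated parameters:", np.round(theta, 2))  # close to [4, 3]
```

The printed cost decreases at each checkpoint, and the final parameters approach the true intercept and slope, which is exactly the behavior the two iterative steps above describe.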