ML 2
Assignment-3

1. Explain the different types of regression techniques.
Linear Regression: Predicts the relationship between the dependent and independent
variables by fitting a linear equation. It’s suitable for continuous data with a linear
relationship.
Logistic Regression: Used for binary classification problems. It predicts the
probability that a given input belongs to a particular category, using a logistic
function.
Polynomial Regression: Extends linear regression by using polynomial terms. It’s useful when the relationship between the independent and dependent variables is non-linear.
Ridge Regression: A form of linear regression that includes a penalty term for large
coefficients, helping to avoid overfitting.
Lasso Regression: Similar to ridge regression but with a penalty that can shrink some
coefficients to zero, effectively performing feature selection.
Elastic Net Regression: A combination of ridge and lasso regression, useful for
handling multicollinearity and feature selection.
Support Vector Regression (SVR): Uses Support Vector Machine principles to fit a
model within a margin of tolerance. It's suitable for complex relationships in small
datasets.
Decision Tree Regression: A tree-based model where each decision node represents
a test on an attribute, and each leaf node represents an output value.
Random Forest Regression: An ensemble method that combines multiple decision
trees for improved accuracy and reduced overfitting.
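To make a few of these concrete, here is a minimal sketch (using scikit-learn; the synthetic data and alpha values are illustrative, not from the assignment) that fits plain, ridge, lasso, and elastic net regression to the same data:

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet

# Synthetic data: 3 features, the middle one irrelevant to the target
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([3.0, 0.0, -2.0]) + rng.normal(scale=0.5, size=50)

for model in [LinearRegression(), Ridge(alpha=1.0),
              Lasso(alpha=0.1), ElasticNet(alpha=0.1, l1_ratio=0.5)]:
    model.fit(X, y)
    # Lasso and ElasticNet can shrink the irrelevant coefficient to exactly zero
    print(type(model).__name__, model.coef_.round(3))

Note how the penalized models pull the coefficients toward zero; lasso and elastic net can zero out the irrelevant feature entirely, which is the feature-selection behaviour described above.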
2. Can decision trees be used for regression? Explain.

Yes, decision trees can be used for regression. This is known as Decision Tree Regression.
Explanation:
In decision tree regression, the algorithm splits the data at nodes based on the feature
that minimizes the variance in the target variable within each split.
Unlike classification trees, where each leaf represents a class label, each leaf node in
regression trees represents a continuous value, typically the average of all values in
that node's data subset.
Regression trees are particularly useful for capturing complex, non-linear relationships between the variables.
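As a minimal sketch of this (using scikit-learn’s DecisionTreeRegressor; the one-feature data below is invented for illustration):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy non-linear data: y = sin(x) sampled on a grid
X = np.linspace(0, 6, 40).reshape(-1, 1)
y = np.sin(X).ravel()

# Splits are chosen to minimize the variance (squared error) within each
# child node; each leaf then predicts the mean of its training targets
tree = DecisionTreeRegressor(max_depth=3)
tree.fit(X, y)
print(tree.predict([[1.5], [4.0]]))  # piecewise-constant predictions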
3. What do you mean by information gain and entropy? How is it used to build decision trees? Illustrate using an example.

Entropy measures the impurity of a set of examples S: Entropy(S) = −Σ pᵢ log₂ pᵢ, where pᵢ is the proportion of examples in S belonging to class i. Information gain is the reduction in entropy obtained by splitting S on an attribute A: Gain(S, A) = Entropy(S) − Σᵥ (|Sᵥ| / |S|) × Entropy(Sᵥ), where Sᵥ is the subset of S for which A takes value v. Decision tree algorithms such as ID3 use these quantities as follows:
1. The decision tree algorithm calculates the entropy for each possible split in the
dataset.
2. It chooses the split that maximizes the information gain, thereby reducing the
dataset’s impurity the most.
3. This process continues recursively until each node is pure (or reaches a stopping
condition).
Example: Suppose we want to classify whether a person will buy a car based on their income. At each node we compute the entropy of the current set of examples, evaluate candidate splits (e.g., income above or below some threshold), and grow the tree with the split that maximizes information gain; the sketch below works through the arithmetic.
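Here is a minimal sketch of that computation in Python (the incomes, labels, and the threshold of 32 are invented for illustration):

import math
from collections import Counter

def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# Toy data: (income in $1000s, buys_car) -- values invented for illustration
data = [(25, 'no'), (30, 'no'), (45, 'yes'), (50, 'yes'),
        (35, 'no'), (60, 'yes'), (40, 'yes'), (28, 'no')]
labels = [label for _, label in data]

# Candidate split: income <= 32
left = [label for income, label in data if income <= 32]
right = [label for income, label in data if income > 32]

parent = entropy(labels)                      # 1.0 bit (4 'yes', 4 'no')
weighted = (len(left) / len(data)) * entropy(left) \
         + (len(right) / len(data)) * entropy(right)
print('Information gain:', parent - weighted)  # about 0.55 bits

The parent node has entropy 1.0 (a 50/50 class mix); this split produces a pure left child and a mostly-'yes' right child, for a gain of about 0.55 bits. The algorithm would compare this against other candidate splits and keep the best one.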
4. What are issues in decision tree learning? How are they overcome?
Overfitting: Decision trees can easily become too complex, capturing noise in the data.
o Solution: Use techniques like pruning (removing branches that contribute little), setting a minimum number of samples per leaf, or limiting tree depth (illustrated in the sketch after this list).
High variance: Small changes in the data can result in a completely different tree structure.
o Solution: Use ensemble methods like Random Forest, which average the predictions of many trees to reduce variance.
Bias towards features with many levels: Information-gain-based splitting tends to favor features with many distinct values.
o Solution: Use a split criterion that corrects for this, such as the gain ratio used by C4.5, instead of raw information gain.
Sensitivity to class imbalance: Decision trees may favor the majority class in imbalanced datasets.
o Solution: Use resampling techniques like SMOTE, or assign class weights to balance the classes.
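A hedged sketch of how several of these fixes look in scikit-learn (the dataset and parameter values are arbitrary examples, not prescriptions):

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Toy imbalanced dataset (90% / 10% classes) for illustration
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

# Limit depth and leaf size, and apply cost-complexity pruning (ccp_alpha),
# to keep a single tree from overfitting
tree = DecisionTreeClassifier(max_depth=5, min_samples_leaf=10, ccp_alpha=0.01)
tree.fit(X, y)

# Averaging many trees reduces variance; class_weight='balanced' reweights
# the minority class to counter the imbalance
forest = RandomForestClassifier(n_estimators=200, class_weight='balanced',
                                random_state=0)
forest.fit(X, y)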
5. Use the following data to generate a linear regression model for annual
salary as a function of GPA and number of months worked.
To perform a multiple linear regression, where Annual Salary is predicted based on GPA and
Months Worked, follow these steps:
The model for predicting annual salary (Y) based on GPA (X1) and months worked (X2) is:

Y = β0 + β1·X1 + β2·X2 + ε

where β0 is the intercept, β1 and β2 are the coefficients of GPA and months worked, and ε is the error term. Using a statistical tool (such as Python's statsmodels or sklearn), you can run a multiple linear regression to estimate the coefficients β0, β1, and β2:
import pandas as pd
import statsmodels.api as sm

data = {
    'Annual_Salary': [20000, 24500, 23000, 25000, 20000, 22500, 27500, 19000, 24000, 28500],
    'GPA': [2.8, 3.4, 3.2, 3.8, 3.2, 3.4, 4.0, 2.6, 3.2, 3.8],
    'Months_Worked': [48, 24, 24, 24, 48, 36, 24, 48, 36, 12]
}
df = pd.DataFrame(data)

# Predictors (with an added intercept column) and response
X = df[['GPA', 'Months_Worked']]
Y = df['Annual_Salary']
X = sm.add_constant(X)

# Fit the multiple linear regression by ordinary least squares
model = sm.OLS(Y, X).fit()
print(model.summary())
Suppose the estimated coefficients come out to be β0 = 15000, β1 = 2500, and β2 = 100 (illustrative values). The fitted equation is then:

Y = 15000 + 2500·X1 + 100·X2
Using this equation, we can predict the annual salary for any given values of GPA and
months worked. For instance, if someone has a GPA of 3.5 and has worked for 30 months,
we can substitute these values into the equation:
Calculating this:

Y = 15000 + 2500(3.5) + 100(30) = 15000 + 8750 + 3000 = 26750

So the model predicts an annual salary of about $26,750.
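The same prediction can be obtained programmatically; a sketch continuing the statsmodels code above (it assumes the fitted model object from that snippet, and will use whatever coefficients were actually estimated from the data):

import pandas as pd

# New observation: GPA 3.5, 30 months worked; 'const' matches the added intercept
new_X = pd.DataFrame({'const': [1.0], 'GPA': [3.5], 'Months_Worked': [30]})
print(model.predict(new_X))  # predicted annual salary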