U-4_IML
Ans:
Regression Analysis:
The assumptions in regression analysis are conditions that must be met for the model to provide
reliable estimates and valid inferences.
• Simple Linear Regression: In simple linear regression there is only one dependent
variable and one independent variable, so the model has a single predictor.
▪ The mathematical equation for the simple linear regression model is
shown below.
▪ y = ax + b
o where y is the dependent variable
o x is the independent variable
o a, b are the regression coefficients
• Polynomial Regression: This is a non-linear regression analysis. Polynomial
regression allows flexible curve fitting by fitting a polynomial equation to the data.
▪ The mathematical expression for the polynomial regression model is
shown below (a short fitting sketch for both models follows this list).
▪ y = a0 + a1x + a2x^2 + ... + anx^n
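To make the two forms concrete, here is a minimal fitting sketch using numpy.polyfit; the data is synthetic and the coefficient values are illustrative only.

```python
import numpy as np

# Synthetic data: y is roughly quadratic in x, plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 2.0 + 1.5 * x + 0.3 * x**2 + rng.normal(0, 1.0, x.size)

# Simple linear regression: fit y = ax + b (a single predictor).
a, b = np.polyfit(x, y, deg=1)
print(f"linear fit: y = {a:.2f}x + {b:.2f}")

# Polynomial regression: fit y = a0 + a1*x + a2*x^2.
coeffs = np.polyfit(x, y, deg=2)  # highest-degree coefficient first
print("quadratic coefficients:", np.round(coeffs, 2))
print("prediction at x = 5:", np.polyval(coeffs, 5.0))
```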
1. Model Overfitting or Underfitting: An overly complex model may fit the training
data too closely (overfitting) and fail to generalize, while a simple model may miss
patterns (underfitting).
2. Outliers and Influential Points: Outliers can heavily skew results, especially in small
datasets, affecting the accuracy of the model.
3. Missing Data: Missing values can bias the model or reduce the power of analysis if not
handled correctly.
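To make point 1 concrete, the sketch below fits polynomials of several degrees to synthetic noisy data and compares training error against error on held-out points; the data and the split are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, x.size)

# Hold out every other point as a simple test set.
x_tr, y_tr = x[::2], y[::2]
x_te, y_te = x[1::2], y[1::2]

for deg in (1, 3, 9):
    c = np.polyfit(x_tr, y_tr, deg)
    tr = np.mean((np.polyval(c, x_tr) - y_tr) ** 2)
    te = np.mean((np.polyval(c, x_te) - y_te) ** 2)
    print(f"degree {deg}: train MSE {tr:.3f}, test MSE {te:.3f}")
# Typically: too low a degree underfits (both errors high); too high a
# degree overfits (train error near zero, test error much larger).
```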
Ans:
Linear regression:
Linear regression is a type of supervised machine learning algorithm that computes the linear
relationship between the dependent variable and one or more independent features by fitting a
linear equation to observed data.
Linear regression is a statistical method that models the relationship between two variables by
fitting a straight line to the data. This line, known as the "line of best fit," is used to make
predictions.
For example, imagine you want to predict someone’s monthly expenses based on their income.
Using linear regression, you plot income on the x-axis and expenses on the y-axis. The
regression model will calculate the best-fitting line through these points. With this line, you
can predict a person’s expenses for a given income.
y = mx + b
where:
• y is the predicted value (e.g., expenses),
• x is the input variable (e.g., income),
• m is the slope of the line (how much y changes with each unit increase in x),
• b is the y-intercept (the value of y when x is 0).
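A minimal sketch of this income-expenses example follows; the figures below are invented purely for illustration.

```python
import numpy as np

# Hypothetical monthly income vs. monthly expenses (same currency units).
income   = np.array([2000, 3000, 4000, 5000, 6000], dtype=float)
expenses = np.array([1500, 2100, 2700, 3200, 3900], dtype=float)

# Fit the line of best fit: expenses = m * income + b.
m, b = np.polyfit(income, expenses, deg=1)
print(f"expenses ≈ {m:.3f} * income + {b:.1f}")

# Predict expenses for a new income value.
print("predicted expenses at income 4500:", m * 4500 + b)
```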
Applications:
• Predicting house prices from features such as floor area.
• Forecasting sales, demand, or expenses from historical trends.
• Estimating how strongly one variable drives another (e.g., income vs. expenses).
Advantages:
• Linear regression often serves as a good baseline model for comparison with more
complex machine learning algorithms.
• Linear regression assumes that the features are already in a suitable form for the
model. Feature engineering may be required to transform features into a format that
can be effectively used by the model.
3. With relevant examples, explain Multiple Linear Regression in detail. Write the
basic formula and calculation procedure for Multiple Linear Regression. List out the
differences between Linear Regression and Multiple Regression. Mention the
Applications, Advantages and Disadvantages of Multiple Linear Regression.
Ans:
Multiple linear regression is a frequently used form of predictive analysis. It lets you
examine the relationship between a continuous dependent variable and two or more
independent variables.
Multiple regression analysis allows for the simultaneous control of several factors that affect
the dependent variable. The link between independent variables and dependent variables can
be examined using regression analysis.
Let k stand for the number of independent variables, denoted x1, x2, x3, …, xk. To use
this strategy, we assume that we have k independent variables that we may set; these
variables then probabilistically determine the outcome Y. The basic formula is
Y = b0 + b1x1 + b2x2 + ... + bkxk + e
where b0 is the intercept, b1, …, bk are the regression coefficients, and e is the error term.
Example:
Imagine you are trying to predict the price of a house (y) based on its size (sq. ft.), its
distance from the city (miles), and its number of bedrooms. Suppose the fitted model is:
Price = 50,000 + 200⋅(Size) − 5,000⋅(Distance) + 10,000⋅(Bedrooms)
If a house is 1500 sq. ft., 3 bedrooms, and 10 miles from the city:
Price = 50,000 + 200⋅1500 − 5,000⋅10 + 10,000⋅3 = 330,000
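A tiny sketch checking this arithmetic; the coefficients are the illustrative ones given above, not estimates from real data.

```python
# Price = 50,000 + 200*Size - 5,000*Distance + 10,000*Bedrooms
def predict_price(size_sqft, distance_miles, bedrooms):
    return 50_000 + 200 * size_sqft - 5_000 * distance_miles + 10_000 * bedrooms

# 1500 sq. ft., 10 miles from the city, 3 bedrooms:
print(predict_price(1500, 10, 3))  # 50,000 + 300,000 - 50,000 + 30,000 = 330,000
```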
Multiple linear regression is preferable to simple linear regression when predicting the
outcome of a complex process. A straightforward relationship between two variables
can be precisely captured by simple linear regression. However, multiple linear
regression can identify more intricate interactions that demand deeper analysis.
A multiple regression model uses several independent variables. Because it is not
constrained to a single predictor like the simple regression equation, it can capture
more complex relationships among variables. The uses of multiple linear regression
are as follows.
• Forecasting or prediction
It can be fascinating and helpful to estimate relationships between variables. The multiple
regression model evaluates relationships between variables in terms of their capacity to forecast
the value of the dependent variable, just like all other regression models do.
Advantages:
• Captures Non-Linearity
• Easy to Implement
• Flexible and Interpretable
• Enhances Prediction Accuracy
Limitations:
• Overfitting
• Sensitivity to Outliers
• Lack of Extrapolation Power
• Computational Complexity
• Interpretability Issues
4. How does Polynomial Regression work? Explain polynomial regression with a
real-life example. How can the problems of overfitting and underfitting in Polynomial
Regression be overcome? Mention the Applications, Advantages and Limitations of
Polynomial Regression.
Ans:
Polynomial Regression:
It is a form of linear regression in which the relationship between the independent variable x
and dependent variable y is modelled as an nth-degree polynomial. Polynomial regression fits
a nonlinear relationship between the value of x and the corresponding conditional mean of y,
denoted E(y | x).
If we observe closely, we will realize that to evolve from linear regression to polynomial
regression we simply add higher-order terms of the input features to the feature space.
This is sometimes loosely called feature engineering, though not exactly.
Now, let’s apply polynomial regression to model the relationship between years of
experience and salary. We’ll use a quadratic polynomial (degree 2) for this example.
Salary = β0 + β1×Experience + β2×Experience^2
Now, to find the coefficients that minimize the difference between the predicted salaries and
the actual salaries in the dataset, we can use the method of least squares. The objective is to
minimize the sum of squared differences between the predicted values and the actual values.
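As a sketch of this least-squares step, assume a small invented experience/salary dataset and let numpy.polyfit solve for the degree-2 coefficients.

```python
import numpy as np

# Hypothetical data: years of experience vs. salary.
experience = np.array([1, 2, 3, 5, 7, 10], dtype=float)
salary     = np.array([35_000, 42_000, 50_000, 68_000, 83_000, 105_000], dtype=float)

# Least-squares fit of Salary = b0 + b1*Exp + b2*Exp^2.
b2, b1, b0 = np.polyfit(experience, salary, deg=2)  # highest degree first
print(f"Salary ≈ {b0:.0f} + {b1:.0f}*Exp + {b2:.1f}*Exp^2")
print("predicted salary at 6 years:", np.polyval([b2, b1, b0], 6.0))
```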
Polynomial regression often leads to overfitting when the model's complexity increases to fit
the training data too closely, resulting in poor performance on new data. To address this,
regularization techniques like Ridge and Lasso regression are used. These methods penalize
large model weights, reducing overfitting by discouraging overly complex models. Ridge
regression minimizes the sum of squared weights, while Lasso regression encourages sparsity
by driving some weights to zero, simplifying the model. Regularization ensures a balance
between fitting the training data and maintaining generalization to unseen data.
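To illustrate these options, here is a sketch using scikit-learn's Ridge and Lasso on polynomial features; the degree, alpha values, and synthetic data are arbitrary choices for demonstration, not recommendations.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
X = np.sort(rng.uniform(0, 1, 25)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 25)

# A high-degree polynomial with plain least squares tends to overfit;
# Ridge (L2) shrinks the weights, Lasso (L1) can drive some to zero.
for name, reg in [("ols", LinearRegression()),
                  ("ridge", Ridge(alpha=1e-3)),
                  ("lasso", Lasso(alpha=1e-3, max_iter=50_000))]:
    model = make_pipeline(PolynomialFeatures(degree=10), reg)
    model.fit(X, y)
    print(name, "train R^2:", round(model.score(X, y), 3))
```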
Applications:
The reason behind the wide use of polynomial regression is that much real-world data
is non-linear in nature, so when we fit a non-linear model or a curvilinear regression
line to such data, the results we obtain are far better than what we can achieve with
standard linear regression. Some of the use cases of polynomial regression are stated
below:
• Modelling the growth rate of tissues or populations.
• Studying the progression of disease epidemics.
• Analysing other quantities that rise or fall at varying rates.
Advantages:
• A broad range of functions can be fitted under it.
• It can model a wide range of curvature in the data.
Disadvantages:
• The presence of one or two outliers in the data can seriously affect the results of
nonlinear analysis.
• In addition, there are unfortunately fewer model validation tools for the detection of
outliers in nonlinear regression than there are for linear regression.
5. What are odds? Why are they used in logistic regression? With an example,
describe how we can solve a multiclass classification problem using Logistic
Regression. Discuss the assumptions in logistic regression. Mention the
Applications, Advantages and Limitations of Logistic Regression.
Ans:
Odds:
Odds represent the ratio of the probability of an event occurring to the probability of it not
occurring. Mathematically:
Odds = P(event) / (1 − P(event))
In logistic regression, odds are used because the model predicts probabilities of outcomes, and
the log of the odds (logit) provides a linear relationship between the predictors and the target.
This makes it easier to estimate parameters and interpret the relationship between variables.
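A small sketch computing odds and log-odds for a few arbitrarily chosen probabilities:

```python
import math

def odds(p):
    """Odds = P(event) / (1 - P(event))."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds: the quantity logistic regression models linearly."""
    return math.log(odds(p))

for p in (0.1, 0.5, 0.8):
    print(f"p = {p}: odds = {odds(p):.3f}, log-odds = {logit(p):+.3f}")
# p = 0.5 gives odds 1 and log-odds 0; p > 0.5 gives positive log-odds.
```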
Logistic regression is inherently a binary classifier, but multiclass problems can be solved using
techniques like:
• One-vs-Rest (OvR): The problem is divided into multiple binary classification tasks,
where one class is treated as the positive class and all others as negative. For example,
to classify handwritten digits (0-9), 10 binary classifiers are trained, one for each digit.
• One-vs-One (OvO): A binary classifier is trained for every pair of classes. For
example, for classifying digits, classifiers for (0 vs 1), (0 vs 2), (1 vs 2), and so on are
created, resulting in C(10, 2) = (10×9)/2 = 45 classifiers. The final prediction is based
on a voting mechanism.
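Both strategies are available off the shelf in scikit-learn. The sketch below trains each on the bundled digits dataset; the solver settings are arbitrary illustrative choices.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

X, y = load_digits(return_X_y=True)  # 10 classes: digits 0-9
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base = LogisticRegression(max_iter=5000)
ovr = OneVsRestClassifier(base).fit(X_tr, y_tr)  # 10 binary classifiers
ovo = OneVsOneClassifier(base).fit(X_tr, y_tr)   # C(10, 2) = 45 classifiers

print("OvR accuracy:", round(ovr.score(X_te, y_te), 3))
print("OvO accuracy:", round(ovo.score(X_te, y_te), 3))
```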
Assumptions:
❖ Independent observations: Each observation is independent of the others, meaning
the observations are not correlated with one another.
❖ Binary dependent variable: The model assumes that the dependent variable is
binary or dichotomous, meaning it can take only two values.
❖ Linearity relationship between independent variables and log odds: The
relationship between the independent variables and the log odds of the dependent
variable should be linear.
❖ No outliers: There should be no outliers in the dataset.
❖ Large sample size: The sample size should be sufficiently large for stable estimates.
Applications:
❖ Healthcare: Predicting the presence of a disease (e.g., diabetes or cancer detection).
❖ Marketing: Classifying customer responses (e.g., whether they will purchase a product
or not).
❖ Finance: Assessing credit risk or likelihood of loan default.
❖ Education: Predicting student performance or dropout risk.
❖ Social Science: Modeling voting behavior or survey responses.
Advantages:
• Simplicity and Interpretability
• Efficiency
• Probability Output
• Works Well with Linearly Separable Data
• Less Prone to Overfitting
Disadvantages:
• Linear Decision Boundaries
• Sensitive to Outliers
• Assumes Independence of Features
• Limited to Binary and Multiclass Classification
• Requires Large Datasets
6. Write the difference between Probability and Likelihood. In detail, explain the
concept of Maximum Likelihood Estimation with an example. Mention the
Applications, Advantages and Disadvantages of Maximum Likelihood Estimation.
Ans:
Probability, in simple words, describes the chance of some event happening given the
circumstances or conditions, often as a conditional probability. Likelihood, by contrast,
measures how plausible particular parameter values are given data that has already been
observed.
Also, the sum of all the probabilities associated with a particular problem is one and cannot
exceed it, whereas a likelihood value can be greater than one.
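One way to see the "greater than one" point: a likelihood for continuous data is a density evaluated at the observations, not a probability. The sketch below uses a normal density with a small, arbitrarily chosen sigma.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Probabilities sum to 1 and never exceed it, but a density value can:
print(normal_pdf(0.0, mu=0.0, sigma=0.1))  # ~3.989, greater than 1
```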
Example:
Maximum likelihood estimation underlies several machine learning and deep learning
approaches used for classification problems. One example is logistic regression, where the
algorithm classifies data points using the best-fit line (decision boundary) on the graph.
Picture the data observations plotted in a two-dimensional diagram where the X-axis
represents the independent column (the training data) and the Y-axis represents the target
variable. A line is drawn to separate the two kinds of observations, positives and negatives.
According to the algorithm, observations that fall above the line are considered positive, and
data points below the line are regarded as negative data points.
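A minimal worked MLE sketch on invented coin-flip (Bernoulli) data: the p that maximizes the likelihood matches the sample mean, as the closed-form solution predicts.

```python
import numpy as np

# Observed flips: 1 = heads, 0 = tails (7 heads out of 10).
flips = np.array([1, 1, 0, 1, 1, 0, 1, 1, 0, 1])

def log_likelihood(p, data):
    """Bernoulli log-likelihood of the data for heads-probability p."""
    return np.sum(data * np.log(p) + (1 - data) * np.log(1 - p))

# Grid search for the maximizing p (closed form: the sample mean).
grid = np.linspace(0.01, 0.99, 99)
p_hat = grid[np.argmax([log_likelihood(p, flips) for p in grid])]
print("MLE of p:", round(p_hat, 2))  # 0.7
print("sample mean:", flips.mean())  # matches
```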
Applications:
• Machine Learning: MLE is used to estimate the parameters of various models,
including logistic regression, Naive Bayes, and hidden Markov models.
• Econometrics: It helps in estimating the parameters of economic models, such as
demand and supply functions.
• Reliability Engineering: Used to estimate the failure rates of systems and components.
• Signal Processing: MLE is applied to estimate parameters in signal noise models, such
as in radar or communication systems.
Advantages:
• Asymptotically Efficient
• Flexibility
• Consistent
• Statistical Inference
• Broad Applicability
Disadvantages:
• Computational Complexity
• Sensitivity to Outliers
• Requires Large Sample Sizes
• Model Assumptions
• Local Maxima