TSEGIII AI
COLLEGE OF COMPUTING
DEPARTMENT OF INFORMATION SCIENCE
ARTIFICIAL INTELLIGENCE
INDIVIDUAL ASSIGNMENT
SUBMITTED TO: MR. DIBABA
1. LINEAR REGRESSION
Linear regression is a specific type of regression analysis used when you expect a clear, straight-line relationship between the independent and dependent variables; this is where the term "linear" in linear regression comes from. The straight line is described by the equation Y = a + bX.
Simple linear regression involves a single independent variable and a single dependent variable. The relationship between these variables is modeled using a straight line, represented by the equation Y = a + bX, where Y is the dependent variable, X is the independent variable, a is the intercept, and b is the slope.
Example
Predicting a person's blood pressure based on the number of hours they exercise per week.
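As a concrete illustration, the following is a minimal sketch of simple linear regression in Python using scikit-learn; the exercise and blood-pressure values are invented for illustration only.

import numpy as np
from sklearn.linear_model import LinearRegression

# X: hours of exercise per week; Y: systolic blood pressure (invented data)
hours = np.array([[1], [3], [5], [7], [9]])
pressure = np.array([140, 135, 128, 124, 118])

model = LinearRegression().fit(hours, pressure)
print("intercept a:", model.intercept_)      # a in Y = a + bX
print("slope b:", model.coef_[0])            # b in Y = a + bX
print("prediction for 6 hours/week:", model.predict([[6]])[0])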
Multiple Linear Regression
Multiple linear regression extends simple linear regression to two or more independent variables, modeled as Y = a + b1X1 + b2X2 + … + bnXn, where each coefficient bi measures the effect of predictor Xi on Y.
Advantages: Linear regression is simple to implement, and its output is easy to interpret.
Disadvantages: Outliers can have a large effect on the regression coefficients, and the fitted boundaries and relationships are restricted to being linear.
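A minimal sketch of multiple linear regression, again with scikit-learn; the two predictor columns (exercise hours and age) and all values are invented for illustration.

import numpy as np
from sklearn.linear_model import LinearRegression

# X columns: hours of exercise per week, age in years (invented data)
X = np.array([[1, 60], [3, 55], [5, 50], [7, 45], [9, 40]])
y = np.array([142, 136, 129, 125, 119])      # systolic blood pressure

model = LinearRegression().fit(X, y)
print("intercept a:", model.intercept_)
print("coefficients b1, b2:", model.coef_)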
2. NONLINEAR REGRESSION
Nonlinear regression works by fitting a nonlinear model to a set of data points. Unlike linear
regression, where the best-fit line is used, nonlinear regression may use curves to describe
relationships. Techniques such as polynomial regression or neural networks can be employed to
capture the intrinsic structure of data. The goal is to minimize the difference between predicted
and actual outcomes, often using optimization algorithms.
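For example, the following sketch fits a nonlinear (exponential) model with SciPy's curve_fit, which minimizes the squared difference between predicted and actual values using an optimization algorithm; the data points are invented and roughly follow y = 2·exp(0.7x).

import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(b * x)                 # nonlinear model y = a * exp(b * x)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 8.2, 16.5, 33.0])    # invented, roughly 2 * exp(0.7 x)

# curve_fit searches for the (a, b) minimizing the squared residuals
params, _ = curve_fit(model, x, y, p0=(1.0, 0.5))
print("fitted a, b:", params)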
Polynomial Regression. Polynomial regression fits a nonlinear curve to the data by using
polynomial equations, allowing for modeling of relationships that curve upwards or
downwards. It is useful for capturing trends in data beyond linear relationships.
Logistic Regression. Despite its name, logistic regression is used for binary
classification problems. It models the probability of an event occurring and is useful in
scenarios where the outcome is categorical, such as yes/no decisions (a short sketch follows this list).
Exponential Regression. This type models situations where growth accelerates over time,
such as populations and interest calculations. It describes relationships that increase or
decrease at a rate proportional to their current value.
Power Regression. Power regression is suited for data that follow a power-law
relationship of the form y = a·x^b. It is often used in scientific applications where
phenomena obey power laws, such as in physics and biology.
Generalized Additive Models (GAM). GAMs extend linear models by allowing the
prediction of the response variable based on smooth functions of predictor variables. This
flexibility makes them suitable for complex datasets with non-linear patterns.
Support Vector Regression (SVR). SVR uses principles from support vector machines to
find a function that deviates from actual data points within a specified threshold, making it
effective for complex data patterns.
Decision Trees. Decision tree algorithms split data into subsets based on feature values,
making decisions at each branch. They are intuitive and good at capturing nonlinear
relationships in data.
Artificial Neural Networks (ANN). ANNs mimic the human brain’s structure, learning to
model complex relationships through interconnected nodes. They are powerful for
handling large datasets with nonlinear functions.
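As noted above, here is a minimal sketch of logistic regression for a binary (yes/no) outcome using scikit-learn; the study-hours and pass/fail values are invented for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

# X: hours studied; y: 1 = passed, 0 = failed (invented data)
hours = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
passed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression().fit(hours, passed)
print("P(pass | 4.5 hours):", clf.predict_proba([[4.5]])[0, 1])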
3. POLYNOMIAL REGRESSION
Polynomial regression is a type of regression analysis used in statistics and machine learning to model the relationship between a dependent variable y and an independent variable x as an n-th degree polynomial. Unlike simple linear regression, which models the relationship as a straight line, polynomial regression can capture non-linear patterns in the data by fitting a polynomial equation.
How Polynomial Regression Works
The general form of a polynomial regression equation of degree n is:
y = β0 + β1x + β2x^2 + … + βnx^n + ε
where:
y is the dependent variable,
x is the independent variable,
β0, β1, …, βn are the coefficients of the polynomial terms, and
ε represents the error term.
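A minimal sketch of degree-2 polynomial regression using scikit-learn's PolynomialFeatures combined with LinearRegression; the data values are invented and roughly follow y = 1 + 2x + 0.5x^2.

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

x = np.array([[0], [1], [2], [3], [4]])
y = np.array([1.1, 3.4, 7.2, 11.6, 17.0])    # invented, roughly 1 + 2x + 0.5x^2

# PolynomialFeatures expands x into [1, x, x^2]; LinearRegression fits the betas
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x, y)
print("prediction at x = 5:", model.predict([[5]])[0])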
4. TYPES OF REGRESSION TECHNIQUES
There are several types of regression techniques, each suited for different types of data and
relationships:
Linear Regression: Assumes a linear relationship between the independent and dependent
variables. It fits the best straight line (or hyperplane, when there are several predictors) through the points.
Polynomial Regression: Extends linear regression by adding polynomial terms to capture more
complex relationships.
Support Vector Regression (SVR): Based on the support vector machine algorithm, it fits a
function so that most points lie within a specified margin (an ε-insensitive tube) around it,
penalizing only points that fall outside that margin.
Decision Tree Regression: Builds a decision tree to predict the target value, where each node
represents a decision, and each branch represents the outcome.
Random Forest Regression: An ensemble method that combines multiple decision trees to
predict the target value.
Ridge Regression: A type of linear regression used to prevent overfitting by adding a penalty
term to the loss function.
Lasso Regression: Similar to ridge regression but forces some weights to zero, effectively
performing feature selection.
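To make the difference between ridge and lasso concrete, here is a minimal sketch on invented data where only two of five features actually matter; note how lasso can drive the irrelevant coefficients exactly to zero.

import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
# only features 0 and 1 influence the target (invented setup)
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.1, size=50)

print("ridge coefficients:", Ridge(alpha=1.0).fit(X, y).coef_)
print("lasso coefficients:", Lasso(alpha=0.1).fit(X, y).coef_)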
Characteristics of Regression
Continuous Target Variable: Regression deals with predicting continuous target variables that
represent numerical values.
Error Measurement: Models are evaluated based on their ability to minimize the error between
the predicted and actual values.
Model Complexity: Ranges from simple linear models to more complex nonlinear models.
Overfitting and Underfitting: Regression models are susceptible to both overfitting and
underfitting.
Interpretability: Varies depending on the algorithm used; simple linear models are highly
interpretable, while more complex models may be harder to interpret.
5. BAYESIAN LINEAR REGRESSION
Bayesian regression is a type of linear regression that uses Bayesian statistics to estimate the
unknown parameters of a model. It uses Bayes' theorem to compute the posterior probability of a set of
parameters given observed data. The goal of Bayesian regression is to find the best estimate of
the parameters of a linear model that describes the relationship between the independent and
the dependent variables. The main difference between traditional linear regression and
Bayesian regression is the underlying assumption regarding the data-generating process.
Traditional linear regression assumes that the errors follow a Gaussian (normal) distribution, while
Bayesian regression makes stronger assumptions by also placing a prior
probability distribution on the parameters themselves. Bayesian regression also enables more flexibility as
it allows for additional parameters or prior distributions, and can be used to construct an
arbitrarily complex model that explicitly expresses prior beliefs about the data. Additionally,
Bayesian regression can provide more reliable predictions from fewer data points and yields
uncertainty estimates around those predictions. On the other hand, traditional
linear regressions are easier to implement and generally faster with simpler models and can
provide good results when the assumptions about the data are valid.
There are several reasons why Bayesian regression is useful over other regression techniques.
Some of them are as follows:
Bayesian regression incorporates prior beliefs about the parameters into the analysis, which
makes it useful when limited data are available and relevant prior knowledge exists.
Bayesian regression provides a natural way to measure the uncertainty in the estimation of
regression parameters by generating the posterior distribution, which captures the uncertainty
in the parameter values, as opposed to the single point estimate that is produced by standard
regression techniques.
Bayesian regression facilitates model selection and comparison by calculating the posterior
probabilities of different models.
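To illustrate, the following is a minimal sketch of Bayesian linear regression using scikit-learn's BayesianRidge, which places Gaussian priors on the coefficients and returns an uncertainty estimate alongside each prediction; the data values are invented for illustration.

import numpy as np
from sklearn.linear_model import BayesianRidge

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2.1, 4.2, 5.8, 8.1, 9.9])      # invented, roughly y = 2x

model = BayesianRidge().fit(X, y)
# posterior predictive mean and standard deviation at X = 6
mean, std = model.predict([[6]], return_std=True)
print("prediction:", mean[0], "+/-", std[0])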