GRP Project DT
Experiment: 5
Student Names and UIDs:
1. ABHISHEK CHOUDHARY 23BAI70030
2. RISHI JAIN 23BAI70569
3. JATIN CHADDA 23BAI70041
4. KAVYA JAIN 23BAI70137
5. GAUTAM KUMAR 23BAI70207
6. MAYANK GUPTA 23BAI70292
Compare models: The primary objective of model comparison and selection is to improve the performance of the machine learning solution. The goal is to narrow down to the algorithms that best suit both the data and the business requirements.
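As a minimal sketch of model comparison, two candidate classifiers can be scored with k-fold cross-validation and the better-performing one selected. The dataset and the two model choices below are illustrative, not the ones used in the experiment.

```python
# Compare two candidate models by mean cross-validated accuracy.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}
# 5-fold cross-validation gives a more reliable score than a single split.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
best = max(scores, key=scores.get)  # model with the highest mean accuracy
```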
Handling of outliers : To handle outliers effectively, analysts should identify them through
visualization or statistical methods, evaluate their impact on analysis, and apply
appropriate techniques like trimming, transformation, or exclusion to mitigate their
influence.
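A small sketch of one such technique: detecting outliers with the conventional 1.5 * IQR rule and trimming them. The data values here are made up for illustration.

```python
# IQR-based outlier detection and trimming.
import numpy as np

data = np.array([10, 12, 11, 13, 12, 95, 11, 10])  # 95 is an obvious outlier
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
trimmed = data[(data >= lower) & (data <= upper)]  # drop values outside the fences
```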
Building Models: ML model development involves acquiring data from trusted sources, processing the data to make it suitable for modelling, choosing an algorithm, building the model, computing performance metrics, and selecting the best-performing model.
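The model-building steps above can be sketched end to end as a small pipeline. The dataset and algorithm below are stand-ins chosen for illustration.

```python
# Acquire data, preprocess, choose an algorithm, fit, and evaluate.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)                    # data acquisition
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
# Preprocessing (scaling) and the chosen algorithm in one pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_tr, y_tr)                                         # build the model
accuracy = model.score(X_te, y_te)                            # performance metric
```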
Model performance (PCA): Principal Component Analysis (PCA) is one of the most
commonly used unsupervised machine learning algorithms across a variety of
applications: exploratory data analysis, dimensionality reduction, information
compression, data de-noising, and plenty more.
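A minimal PCA sketch for dimensionality reduction, using a toy dataset for illustration:

```python
# Reduce a 4-feature dataset to its 2 leading principal components.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)                  # 4 features -> 2 components
explained = pca.explained_variance_ratio_.sum()   # variance retained
```

For the iris data, the first two components retain most of the variance, which is why PCA is so widely used for compression and visualisation.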
University Institute of Engineering
Department of Computer Science & Engineering
4. Code:
1. Input:
Output:
2. Input:
Output:
3. Input:
4. Input:
Output:
5. Input:
Output:
6. Input:
Output:
7. Input:
Output:
8. Input:
9. Input:
Output:
10. Input:
11. Input:
Output:
12. Input:
Output:
13. Input:
14. Input:
Output:
STEPS:
1. Data Collection: machine learning requires a large amount of training data.
2. Data Preparation: clean the data and split it into training and test sets.
3. Choose a Model / Algorithm: the third step consists of selecting the right model.
4. Train the Model and Make Predictions.
The first step in model evaluation is to prepare the data: split the dataset into training and test sets using the train_test_split function from the scikit-learn library. This ensures that separate data is available for training and for evaluating the model. The model is then evaluated on the test set.
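This split-then-evaluate flow can be sketched as follows (the dataset, classifier, and split ratio are illustrative):

```python
# Split the data, train on the training set, evaluate on the test set.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))  # score on unseen data only
```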
Machine learning is usually divided into two main types. In the predictive or supervised learning approach, the goal is to learn a mapping from inputs x to outputs y, given a labeled set of input-output pairs.
We measure a feature's importance by calculating the increase in the model's prediction error after perturbing the feature. A feature is "important" if perturbing its values increases the model error, because the model relied on that feature for its predictions.
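This perturb-and-measure idea is implemented in scikit-learn as permutation importance, which shuffles each feature in turn and records the drop in score. A minimal sketch, with an illustrative dataset and model:

```python
# Permutation importance: shuffle each feature, measure the score drop.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
importances = result.importances_mean  # one mean score-drop per feature
```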
Classification models have various evaluation metrics to gauge the model's performance.
Commonly used metrics are Accuracy, Precision, Recall, F1 Score, Log loss, etc. It is worth
noting that not all metrics can be used for all situations.
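The common classification metrics can be computed directly with scikit-learn. The label vectors below are made up to illustrate the calls:

```python
# Accuracy, precision, recall, and F1 on a small binary example.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # one false negative, one false positive
acc = accuracy_score(y_true, y_pred)    # fraction of correct predictions
prec = precision_score(y_true, y_pred)  # TP / (TP + FP)
rec = recall_score(y_true, y_pred)      # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)           # harmonic mean of precision and recall
```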
Good accuracy in machine learning is subjective, but accuracy above 70% is generally considered strong model performance. An accuracy between 70% and 90% is both realistic and consistent with industry standards.
1. Regression is a statistical method that determines the strength of the relationship between one dependent variable (y) and one or more independent variables (x1, x2, x3, ...).
2. This is done to gain information about one variable from the known values of the others.
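A minimal sketch of this idea with simple linear regression; the data points are invented to follow roughly y = 2x:

```python
# Fit y against x and measure the strength of the linear relationship.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4], [5]])        # independent variable
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])       # dependent variable, ~ 2x
model = LinearRegression().fit(X, y)
r2 = model.score(X, y)  # R^2: closer to 1 means a stronger relationship
```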
5. Learning Outcomes: We learned how to compare models, train models, save models, etc.
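One of those outcomes, saving and reloading a trained model, can be sketched with joblib (the file path here is illustrative):

```python
# Save a trained model to disk and load it back.
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)
path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, path)        # save the trained model
restored = joblib.load(path)    # load it back for later predictions
```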
Evaluation Grid: