GRP Project DT
Experiment: 5
Student Names and UIDs:
1. ABHISHEK CHOUDHARY 23BAI70030
2. RISHI JAIN 23BAI70569
3. JATIN CHADDA 23BAI70041
4. KAVYA JAIN 23BAI70137
5. GAUTAM KUMAR 23BAI70207
6. MAYANK GUPTA 23BAI70292
Compare models: The primary objective of model comparison and selection is to improve the performance of the machine learning solution. The goal is to narrow down to the algorithms that best suit both the data and the business requirements.
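As a minimal sketch of model comparison, two candidate classifiers can be scored with k-fold cross-validation and the better-performing one selected. The dataset and the two model choices below are illustrative, not the ones used in the experiment.

```python
# Compare two candidate models by mean cross-validated accuracy.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}
# 5-fold cross-validation gives a more reliable score than a single split.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
best = max(scores, key=scores.get)  # model with the highest mean accuracy
```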
Handling of outliers : To handle outliers effectively, analysts should identify them through
visualization or statistical methods, evaluate their impact on analysis, and apply
appropriate techniques like trimming, transformation, or exclusion to mitigate their
influence.
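A small sketch of one such technique: detecting outliers with the conventional 1.5 * IQR rule and trimming them. The data values here are made up for illustration.

```python
# IQR-based outlier detection and trimming.
import numpy as np

data = np.array([10, 12, 11, 13, 12, 95, 11, 10])  # 95 is an obvious outlier
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
trimmed = data[(data >= lower) & (data <= upper)]  # drop values outside the fences
```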
Building Models: ML model development involves acquiring data from trusted sources, processing the data to make it suitable for modelling, choosing an algorithm, building the model, computing performance metrics, and selecting the best-performing model.
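The model-building steps above can be sketched end to end as a small pipeline. The dataset and algorithm below are stand-ins chosen for illustration.

```python
# Acquire data, preprocess, choose an algorithm, fit, and evaluate.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)                    # data acquisition
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
# Preprocessing (scaling) and the chosen algorithm in one pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_tr, y_tr)                                         # build the model
accuracy = model.score(X_te, y_te)                            # performance metric
```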
Model performance (PCA): Principal Component Analysis (PCA) is one of the most
commonly used unsupervised machine learning algorithms across a variety of
applications: exploratory data analysis, dimensionality reduction, information
compression, data de-noising, and plenty more.
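A minimal PCA sketch for dimensionality reduction, using a toy dataset for illustration:

```python
# Reduce a 4-feature dataset to its 2 leading principal components.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)                  # 4 features -> 2 components
explained = pca.explained_variance_ratio_.sum()   # variance retained
```

For the iris data, the first two components retain most of the variance, which is why PCA is so widely used for compression and visualisation.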
University Institute of Engineering
Department of Computer Science & Engineering
4. Code:
1. Input:
Output:
2. Input:
Output:
3. Input:
4. Input:
Output:
5. Input:
Output:
6. Input:
Output:
7. Input:
Output:
8. Input:
9. Input:
Output:
10. Input:
11. Input:
Output:
12. Input:
Output:
13. Input:
14. Input:
Output:
STEPS:
1. Data Collection: machine learning requires a large amount of training data.
2. Data Preparation: clean the data and split it into training and test sets.
3. Choose a Model / Algorithm: the third step consists of selecting the right model.
4. Train the Model and Make Predictions.
The first step in model evaluation is to prepare the data: split the dataset into training and test sets using the train_test_split function from the scikit-learn library. This ensures that separate data is available for training and for evaluating the model. The model is then evaluated on the test set.
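This split-then-evaluate flow can be sketched as follows (the dataset, classifier, and split ratio are illustrative):

```python
# Split the data, train on the training set, evaluate on the test set.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))  # score on unseen data only
```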
Machine learning is usually divided into two main types. In the predictive or supervised learning approach, the goal is to learn a mapping from inputs x to outputs y, given a labeled set of input-output pairs.
We measure a feature's importance by calculating the increase in the model's prediction error after perturbing the feature. A feature is "important" if perturbing its values increases the model error, because the model relied on that feature for its predictions.
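This perturb-and-measure idea is implemented in scikit-learn as permutation importance, which shuffles each feature in turn and records the drop in score. A minimal sketch, with an illustrative dataset and model:

```python
# Permutation importance: shuffle each feature, measure the score drop.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
importances = result.importances_mean  # one mean score-drop per feature
```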
Classification models have various evaluation metrics to gauge the model's performance.
Commonly used metrics are Accuracy, Precision, Recall, F1 Score, Log loss, etc. It is worth
noting that not all metrics can be used for all situations.
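The common classification metrics can be computed directly with scikit-learn. The label vectors below are made up to illustrate the calls:

```python
# Accuracy, precision, recall, and F1 on a small binary example.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # one false negative, one false positive
acc = accuracy_score(y_true, y_pred)    # fraction of correct predictions
prec = precision_score(y_true, y_pred)  # TP / (TP + FP)
rec = recall_score(y_true, y_pred)      # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)           # harmonic mean of precision and recall
```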
Good accuracy in machine learning is subjective, but accuracy above 70% is generally considered strong model performance. An accuracy between 70% and 90% is both realistic and consistent with industry standards.
1. Regression is a statistical method that determines the strength of the relationship between one dependent variable (y) and one or more independent variables (x1, x2, x3, ...).
2. This is done to gain information about one variable from the known values of the others.
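A minimal sketch of this idea with simple linear regression; the data points are invented to follow roughly y = 2x:

```python
# Fit y against x and measure the strength of the linear relationship.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4], [5]])        # independent variable
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])       # dependent variable, ~ 2x
model = LinearRegression().fit(X, y)
r2 = model.score(X, y)  # R^2: closer to 1 means a stronger relationship
```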
5. Learning Outcomes: We learned how to compare models, train models, save models, etc.
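One of those outcomes, saving and reloading a trained model, can be sketched with joblib (the file path here is illustrative):

```python
# Save a trained model to disk and load it back.
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)
path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, path)        # save the trained model
restored = joblib.load(path)    # load it back for later predictions
```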
Evaluation Grid: