
DS FINAL

This document is a laboratory manual for a Data Science course (SSCA3021) in the Department of Computer Science and Application. It outlines practical exercises covering Python basics, data structures, libraries like Pandas and NumPy, exploratory data analysis, data munging, and building predictive models using various algorithms. The manual also includes a mini project for students to apply their learned skills in a group setting.

Uploaded by

Om Balar


LABORATORY MANUAL

School of Engineering
Department of Computer Science and Application

DATA SCIENCE
(SSCA3021) | BSCIT/BCA

Name of Student:

Enrollment No.: Academic Year:

P P Savani School of Engineering

This is to certify that Mr./Ms. ___________________ of Computer Science and Engineering, having Enrollment No. ___________________, has completed his/her Term work in the subject of DATA SCIENCE (SSCA3021).

Marks Obtained: ________ out of ________

Sign of Faculty          Sign of Head of Department

Date: ___________________          Date: ___________________
Technology Engineering Department.
CONTENTS
SR. NO. | NAME OF PRACTICAL | PAGE NO. | DATE | MARK | SIGN
1. Basics of Python for Data Analysis
2. Why learn Python for data analysis?
3. Python 2.7 v/s 3.4
4. How to install Python? Running a few simple programs in Python
5. Python libraries and data structures
6. Python Data Structures
7. Python Iteration and Conditional Constructs • Python Libraries
8. Exploratory analysis in Python using Pandas
9. Introduction to series and data frames
10. Analytics of dataset - Loan Prediction Problem
11. Data Munging in Python using Pandas
12. Building a Predictive Model in Python: Logistic Regression • Decision Tree • Random Forest
13. Building a Predictive Model in Python: Logistic Regression • Decision Tree • Random Forest
14. Mini Project

Practical - 1, 2, 3, 4

Aim : Introduction to Python Environment

Basics of Python for Data Science:

Understand the differences between Python 2.7 and Python 3.4

Learn how to install Python

Write and run simple Python programs

Installing Python:

Download Python 3.10 from the official website.

Follow the installation steps for your OS.
Verify the installation by running python --version in the terminal.
Creating Your First Python Project:
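
A first program can be a single file. The sketch below assumes a file saved as hello.py (a hypothetical name) and run from the terminal with python hello.py; the function name greet is only illustrative.

```python
# hello.py (hypothetical file name)

def greet(name):
    """Return a greeting for the given name."""
    return f"Hello, {name}!"

# Call the function and show the result.
message = greet("Data Science")
print(message)

# A little arithmetic to confirm the interpreter works.
total = sum([1, 2, 3, 4])
print("Sum:", total)
```

If both lines print without an error, the installation is working.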

Conclusion:

You have successfully installed Python and executed basic Python code.
Practical - 5, 6

Aim: Python Libraries and Data Structures

Objectives:
Learn about Python's data structures.
Understand conditional constructs and iterations.
Explore important Python libraries.

Data Structures:

List:

Definition: A list is an ordered, mutable collection of items. It can contain elements of different types.

Syntax: Lists are defined by placing elements inside square brackets [].
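
As a small illustration of the definition above, the following sketch creates a list and changes it in place. The values are arbitrary examples.

```python
# A list keeps insertion order and may mix element types.
items = [10, "apple", 3.5]

items.append(True)       # add an element to the end
items[0] = 20            # lists are mutable: replace an element by index
first_two = items[:2]    # slicing returns a new list

print(items)
print(first_two)
```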

Dictionaries:

Definition: A dictionary is a collection of key-value pairs. Each key is unique and maps to a value.

Dictionaries are mutable.

Syntax: Dictionaries are defined using curly braces {} with key-value pairs separated by a colon:
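
A minimal sketch of the dictionary syntax, using a made-up student record (the keys and values are assumptions for illustration):

```python
# Keys map to values; each key appears only once.
student = {"name": "Asha", "age": 20}

student["course"] = "BCA"    # add a new key-value pair
student["age"] = 21          # update the value of an existing key

# get() returns a default instead of raising KeyError for a missing key.
missing = student.get("email", "not provided")

for key, value in student.items():
    print(key, "->", value)
```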

Sets:

Definition: A set is an unordered collection of unique elements. Sets are mutable but do not allow duplicate elements.

Syntax: Sets are defined by placing elements inside curly braces {} or using the set() function.
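
The following sketch shows both ways of creating a set and two common set operations. Note that empty braces create a dictionary, so an empty set needs set().

```python
# Duplicates are discarded when the set is built.
numbers = {1, 2, 2, 3}
numbers.add(4)

empty = set()            # {} would create an empty dict, not a set

evens = {2, 4, 6}
common = numbers & evens      # intersection
combined = numbers | evens    # union

print(numbers, common, combined)
```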
Tuples:

Definition: A tuple is an ordered collection of items, but unlike lists, it is immutable, meaning it cannot be changed after creation.

Syntax: Tuples are defined by placing elements inside parentheses ().
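
A short sketch of tuple creation, unpacking, and immutability. The values are arbitrary.

```python
point = (3, 4)
single = (5,)        # a one-element tuple needs the trailing comma

x, y = point         # unpacking assigns each element to a name

# Assigning to an element raises TypeError because tuples are immutable.
try:
    point[0] = 10
except TypeError as exc:
    error_message = str(exc)
    print("Cannot modify a tuple:", error_message)
```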

Summary of Key Differences:

Practical - 7

Aim: Iteration and Conditional Constructs

Iteration:
Iteration refers to the repeated execution of a block of code. Common types of iteration constructs include:

For Loop: Iterates over the items of a sequence or other iterable, running the block of code once per item.

While Loop: Continues to execute as long as a specified condition is true.
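
The two loop types can be sketched as follows. The ranges and starting values are arbitrary.

```python
# for loop: runs once for each value produced by range(1, 6), i.e. 1 to 5.
squares = []
for n in range(1, 6):
    squares.append(n * n)

# while loop: repeats as long as the condition stays true.
countdown = []
count = 3
while count > 0:
    countdown.append(count)
    count -= 1    # without this update the loop would never end

print(squares)
print(countdown)
```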

Conditional Constructs:
Conditional constructs allow you to execute different blocks of code based on certain conditions. The main types
include:

If Statement: Executes a block of code if a condition is true.

If-Else Statement: Provides an alternative block of code if the condition is false.

Elif Statement: Checks further conditions, in order, when the earlier ones are false.

Combining Iteration and Conditionals
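
A minimal sketch that combines a loop with if, elif, and else. The function name grade and the score thresholds are assumptions chosen only for illustration.

```python
def grade(score):
    """Classify a score using hypothetical thresholds."""
    if score >= 80:
        return "distinction"
    elif score >= 40:
        return "pass"
    else:
        return "fail"

scores = [85, 62, 35]
results = []
for s in scores:              # iteration
    results.append(grade(s))  # the conditional runs once per item

print(results)
```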

Key Python Libraries:

NumPy
NumPy (Numerical Python) is a powerful library for numerical operations in Python. It provides support for
large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on
these arrays.

Key Features:
N-dimensional arrays: Efficient storage and manipulation of large datasets.
Mathematical functions: Functions for element-wise operations, linear algebra, statistical operations, etc.
Broadcasting: A powerful mechanism for performing operations on arrays of different shapes.
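
The features listed above can be sketched with a small array. The numbers are arbitrary.

```python
import numpy as np

# A 2 x 3 array (two rows, three columns).
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

doubled = matrix * 2                 # element-wise operation
col_means = matrix.mean(axis=0)      # statistical function along the rows

# Broadcasting: the 1-D array of length 3 is applied to every row.
offsets = np.array([10, 20, 30])
shifted = matrix + offsets

print(doubled)
print(col_means)
print(shifted)
```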

Pandas
Pandas is a library primarily used for data manipulation and analysis. It provides data structures like Series and
DataFrame, which allow for easy handling of structured data.

Key Features:
DataFrame: Two-dimensional size-mutable, potentially heterogeneous tabular data.
Data manipulation: Easy indexing, filtering, grouping, and aggregating of data.
Handling missing data: Built-in functionality for managing missing values.
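
A short sketch of a Series, a DataFrame, and the three features above. The names, departments, and scores are made-up sample data.

```python
import pandas as pd

# A Series is a labelled one-dimensional array.
marks = pd.Series([72, 88, 95], index=["a", "b", "c"])

# A DataFrame is a table whose columns can hold different types.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera"],
    "dept": ["IT", "IT", "CS"],
    "score": [72.0, None, 95.0],      # None becomes a missing value (NaN)
})

high = df[df["score"] > 80]                        # filtering
dept_mean = df.groupby("dept")["score"].mean()     # grouping and aggregating
filled = df["score"].fillna(df["score"].mean())    # handling missing data

print(high)
print(dept_mean)
print(filled)
```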

Matplotlib
Matplotlib is a plotting library that provides a flexible way to create static, animated, and interactive
visualizations in Python.

Key Features:
Versatile plotting: Support for line plots, bar plots, histograms, scatter plots, and more.
Customization: Extensive options for customizing plots (titles, labels, legends, etc.).
Integration: Can be easily integrated with NumPy and Pandas for plotting data.

Combining Them
Here’s a combined example using all three libraries:
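
One possible combined example is sketched below: NumPy generates the values, Pandas holds them in a DataFrame, and Matplotlib draws the plot. The data, the output file name, and the use of the non-interactive Agg backend (so the script also runs without a display) are assumptions for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")    # assumption: render to a file instead of a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# NumPy: 50 evenly spaced values from 0 to 10.
x = np.linspace(0, 10, 50)

# Pandas: store x and its square as columns.
df = pd.DataFrame({"x": x, "y": x ** 2})

# Matplotlib: line plot with title, axis labels, and legend.
fig, ax = plt.subplots()
ax.plot(df["x"], df["y"], label="y = x^2")
ax.set_title("Square of x")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Hypothetical output location in the system temporary directory.
out_path = os.path.join(tempfile.gettempdir(), "combined_plot.png")
fig.savefig(out_path)
print("Plot saved to", out_path)
```

In a notebook or desktop session, the backend line can be dropped and plt.show() used in place of savefig.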
Practical - 8, 9

Aim: Exploratory Data Analysis in Python Using Pandas

Exploratory Data Analysis (EDA) is a crucial step in the data analysis process that helps summarize the main
characteristics of a dataset, often using visual methods.

Objective:
Learn to perform data exploration using Pandas, focusing on Series and DataFrames.

Import Libraries
First, you'll need to import the necessary libraries.

Load the Dataset

You can load a dataset from a CSV file or any other format supported by Pandas.

Data Overview
Get a basic understanding of the dataset.
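
Loading and inspecting a dataset might look like the sketch below. To keep it self-contained, the CSV text is embedded in the script and read through io.StringIO; with a real file, the path would be passed to pd.read_csv instead. The column names and rows are made-up sample data, not the dataset used in the practical.

```python
import io

import pandas as pd

# Hypothetical sample data; the third row has a missing Age.
csv_text = """Survived,Pclass,Sex,Age,Fare
0,3,male,22,7.25
1,1,female,38,71.28
1,3,female,,7.92
0,1,male,54,51.86
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head())        # first rows
df.info()               # column types and non-null counts
print(df.describe())    # summary statistics for numeric columns

missing_per_column = df.isnull().sum()
print(missing_per_column)
```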
Data Visualization
Visualizing data helps to identify patterns, trends, and anomalies.

a. Distribution of Age

b. Survival Rate by Gender

c. Box Plot for Fare by Class

Correlation Analysis
Examining relationships between numerical variables can provide insights.
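
A minimal sketch of a correlation matrix, assuming a small made-up table. Only numeric columns are selected first, since correlation is not defined for text columns.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Age": [22, 38, 26, 35],
    "Fare": [7.25, 71.28, 7.92, 53.10],
    "Sex": ["male", "female", "female", "male"],
})

# Keep numeric columns, then compute pairwise Pearson correlations.
numeric = df.select_dtypes(include="number")
corr = numeric.corr()

print(corr)
```

Values close to 1 or -1 indicate a strong linear relationship; values near 0 indicate little linear relationship.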
Practical - 10

Aim: Loan Prediction Dataset Analytics: Perform basic operations on the Loan Prediction dataset
To perform basic analytics on a Loan Prediction dataset using Pandas, we’ll walk through the following steps:

• Import Libraries

• Load the Dataset

• Data Overview

• Data Cleaning

• Exploratory Data Analysis (EDA)

• Basic Operations and Insights

Step 1: Import Libraries

Step 2: Load the Dataset

Assuming you have a Loan Prediction dataset in CSV format, load it using Pandas. Here, we’ll use a
hypothetical URL for demonstration.
Step 3: Data Overview
Get a basic understanding of the dataset.

Step 4: Data Cleaning

Handle missing values and data types as necessary.
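
One way to carry out this step is sketched below on a tiny made-up table. The column names (Gender, LoanAmount, Loan_Status) are assumed here and should be adjusted to match the actual file; filling with the median and the mode is one common choice, not the only one.

```python
import pandas as pd

# Hypothetical rows with missing values.
loans = pd.DataFrame({
    "Gender": ["Male", None, "Female", "Male"],
    "LoanAmount": [120.0, None, 100.0, 140.0],
    "Loan_Status": ["Y", "N", "Y", "N"],
})

# Numeric column: fill with the median.
loans["LoanAmount"] = loans["LoanAmount"].fillna(loans["LoanAmount"].median())

# Categorical column: fill with the most frequent value.
loans["Gender"] = loans["Gender"].fillna(loans["Gender"].mode()[0])

# Store the target as a categorical type.
loans["Loan_Status"] = loans["Loan_Status"].astype("category")

print(loans)
print(loans.dtypes)
```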

Step 5: Exploratory Data Analysis (EDA)

Visualize key aspects of the dataset.

a. Loan Status Distribution

b. Loan Amount Distribution

c. Correlation Analysis

Step 6: Basic Operations and Insights

Perform some basic analysis to derive insights.

a. Average Loan Amount by Gender

b. Loan Status by Education Level

c. Percentage of Loans Approved by Property Area
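
The three operations above can be sketched as follows. The table is made-up sample data and the column names are assumptions that should match the real dataset.

```python
import pandas as pd

# Hypothetical sample data.
loans = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female"],
    "Education": ["Graduate", "Graduate", "Not Graduate", "Graduate"],
    "Property_Area": ["Urban", "Rural", "Urban", "Urban"],
    "LoanAmount": [100.0, 150.0, 200.0, 50.0],
    "Loan_Status": ["Y", "N", "N", "Y"],
})

# a. Average loan amount by gender.
avg_by_gender = loans.groupby("Gender")["LoanAmount"].mean()

# b. Loan status counts by education level.
status_by_education = pd.crosstab(loans["Education"], loans["Loan_Status"])

# c. Percentage of loans approved by property area.
approved = loans["Loan_Status"].eq("Y").astype(int)
approval_pct_by_area = approved.groupby(loans["Property_Area"]).mean() * 100

print(avg_by_gender)
print(status_by_education)
print(approval_pct_by_area)
```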
Practical - 11

Aim: Data Munging in Python Using Pandas

Data Munging (also called Data Wrangling) is the process of cleaning, transforming, and organizing raw data
into a format suitable for analysis. Pandas, being a powerful library, provides several functions and methods
to perform data munging efficiently.

Here’s a step-by-step guide on performing data munging using Pandas.

Steps for Data Munging

• Loading the Data

• Handling Missing Data

• Data Transformation

• Handling Duplicates

• Filtering and Sorting Data

• Feature Engineering

• Encoding Categorical Variables

• Combining DataFrames

1. Load Data

2. Handling Missing Data

Missing data is common in real-world datasets, and Pandas provides various methods to deal with it.
• Detecting Missing Data:

• Filling Missing Values: You can fill missing values with different strategies like the mean,
median, or mode.

• Dropping Rows with Missing Values: Sometimes it’s better to drop rows with too many
missing values.
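
Detecting, filling, and dropping missing values can be sketched as below, using a small made-up table.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the second row is missing both values.
df = pd.DataFrame({
    "age": [25, np.nan, 40],
    "city": ["Surat", None, "Surat"],
})

# Detect: count missing values in each column.
missing_counts = df.isnull().sum()

# Fill: mean for the numeric column, mode for the text column.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].mean())
filled["city"] = filled["city"].fillna(filled["city"].mode()[0])

# Drop: remove every row that contains a missing value.
dropped = df.dropna()

print(missing_counts)
print(filled)
print(dropped)
```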

3. Data Transformation: Data may need to be converted or transformed for consistency.

• Converting Data Types: You may need to change data types for analysis.

• Normalizing Numerical Data: Scaling numerical data to a common range.
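
A sketch of both transformations on made-up data. Min-max scaling to the range 0 to 1 is used here as one possible normalization; other methods, such as standardization, are equally valid.

```python
import pandas as pd

# Hypothetical data where numbers and dates were read as text.
df = pd.DataFrame({
    "price": ["100", "250", "400"],
    "joined": ["2024-01-05", "2024-02-10", "2024-03-15"],
})

# Convert data types.
df["price"] = df["price"].astype(float)
df["joined"] = pd.to_datetime(df["joined"])

# Min-max scaling: (value - min) / (max - min).
price_min = df["price"].min()
price_max = df["price"].max()
df["price_scaled"] = (df["price"] - price_min) / (price_max - price_min)

print(df)
print(df.dtypes)
```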

4. Handling Duplicates: Duplicate data can skew analysis results, so it’s important to handle them.

• Identifying Duplicates:

• Removing Duplicates:
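
Identifying and removing duplicates, sketched on made-up data:

```python
import pandas as pd

# The third row repeats the second.
df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "name": ["a", "b", "b", "c"],
})

# True marks a row that repeats an earlier row.
dup_mask = df.duplicated()

# Keep the first occurrence of each row and renumber the index.
deduped = df.drop_duplicates().reset_index(drop=True)

print(dup_mask)
print(deduped)
```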

5. Filtering and Sorting Data: You often need to filter out unnecessary rows and sort the data for better
analysis.

• Filtering Rows:

• Sorting Data:
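
Filtering and sorting, sketched on made-up data. The age threshold is an arbitrary example.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "age": [17, 32, 25, 41],
})

# Filter: keep rows where the condition is true.
adults = df[df["age"] >= 18]

# Sort: order the remaining rows by age, largest first.
sorted_adults = adults.sort_values("age", ascending=False)

print(sorted_adults)
```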

6. Feature Engineering: Feature engineering is the process of creating new features from the existing
data to improve the performance of models.

• Creating New Features:
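
New features are often simple combinations of existing columns. The sketch below assumes hypothetical income and loan columns; the column names and the derived features are illustrative only.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "ApplicantIncome": [5000, 3000],
    "CoapplicantIncome": [0, 1500],
    "LoanAmount": [100, 200],
})

# Feature 1: combined household income.
df["TotalIncome"] = df["ApplicantIncome"] + df["CoapplicantIncome"]

# Feature 2: income available per unit of loan requested.
df["IncomePerLoanUnit"] = df["TotalIncome"] / df["LoanAmount"]

print(df)
```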

7. Encoding Categorical Variables: Many machine learning algorithms require numerical input, so
converting categorical data to numerical form is often necessary.

• Label Encoding:

• One-Hot Encoding:
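
Both encodings can be sketched with Pandas alone. Label encoding is done here through the category codes (scikit-learn's LabelEncoder is a common alternative), and one-hot encoding through pd.get_dummies. The column name and values are assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "Property_Area": ["Urban", "Rural", "Semiurban", "Urban"],
})

# Label encoding: each category gets an integer code.
# Codes follow the sorted category order: Rural=0, Semiurban=1, Urban=2.
df["area_code"] = df["Property_Area"].astype("category").cat.codes

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["Property_Area"], prefix="area")

print(df)
print(one_hot)
```

Label encoding implies an order between categories, so one-hot encoding is usually the safer choice for unordered categories used in linear models.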

8. Combining DataFrames: Sometimes, you’ll need to combine multiple datasets for analysis.

• Concatenating DataFrames:

• Merging DataFrames:
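
Concatenating stacks tables that share the same columns, while merging joins tables on a common key. A sketch with made-up tables:

```python
import pandas as pd

a = pd.DataFrame({"id": [1, 2], "name": ["Asha", "Ravi"]})
b = pd.DataFrame({"id": [3], "name": ["Meera"]})

# Concatenate: stack the rows of b under the rows of a.
stacked = pd.concat([a, b], ignore_index=True)

# Merge: join on the shared "id" column.
# An inner join keeps only ids that appear in both tables.
scores = pd.DataFrame({"id": [1, 3], "score": [88, 95]})
merged = stacked.merge(scores, on="id", how="inner")

print(stacked)
print(merged)
```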

Example Workflow: Putting It All Together
Practical - 12, 13

Aim: Building a Predictive Model in Python

• Logistic Regression

• Decision Tree

• Random Forest

Logistic Regression

Logistic Regression is a linear model used for binary classification problems. It predicts the
probability of the target variable belonging to a particular class.

Steps:

• Import the necessary libraries and dataset.

• Preprocess the data (handle missing values, encode categorical variables, etc.).

• Train a Logistic Regression model.

• Evaluate the model.

Logistic Regression is commonly used for binary classification problems like spam
detection, loan default prediction, and medical diagnosis.
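
The steps above can be sketched with scikit-learn. A synthetic dataset from make_classification stands in for a real one, so no preprocessing is needed here; the sample size, split ratio, and random seed are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (a stand-in for a real dataset).
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Hold out 25% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Train the model.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate: predicted classes and class probabilities for the test rows.
predictions = model.predict(X_test)
probabilities = model.predict_proba(X_test)
accuracy = accuracy_score(y_test, predictions)

print("Accuracy:", accuracy)
```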

Decision Tree

A Decision Tree is a non-linear model that splits the data based on certain conditions,
creating a tree-like structure. It can be used for both classification and regression tasks.

Steps:

• Load the dataset and preprocess it.

• Train a Decision Tree model.

• Evaluate its performance.

Decision Trees are used in tasks like fraud detection, customer churn prediction, and
diagnosing medical conditions.
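
A minimal sketch of these steps, assuming the Iris dataset bundled with scikit-learn as a stand-in for the practical's data. The depth limit, split ratio, and random seed are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset (already clean, so no preprocessing here).
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=42,
    stratify=iris.target,
)

# Limiting the depth keeps the tree small and easier to interpret.
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)

tree_accuracy = accuracy_score(y_test, tree.predict(X_test))
print("Depth:", tree.get_depth())
print("Accuracy:", tree_accuracy)
```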

Random Forest

A Random Forest is an ensemble method that builds multiple Decision Trees and combines
their predictions to improve accuracy and reduce overfitting.

Steps:

• Prepare the dataset.

• Train a Random Forest model.

• Evaluate the performance.

Random Forests are commonly used for tasks like image classification, risk assessment,
and sentiment analysis due to their robustness and ability to handle large datasets.
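
A minimal sketch of these steps, again assuming the bundled Iris dataset as a stand-in. The number of trees, split ratio, and random seed are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=42,
    stratify=iris.target,
)

# 100 trees, each trained on a bootstrap sample of the training rows.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

forest_accuracy = accuracy_score(y_test, forest.predict(X_test))

# Importance scores show how much each feature contributed to the splits.
importances = forest.feature_importances_

print("Accuracy:", forest_accuracy)
for name, score in zip(iris.feature_names, importances):
    print(name, round(score, 3))
```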
Practical - 14
• Students will create one mini project in group.

-----------------------------------------------------------------------------------------------------------------------------
22SS02IT113
OMBALAR


Importing Required Libraries:-

Importing required libraries refers to including external code or modules at the
beginning of a program so that their functions, methods, and classes can be used
within the program. These libraries provide pre-built solutions to common tasks,
making it easier to develop software without writing everything from scratch.

Importing Dataset:-
Importing a dataset refers to the process of loading data from an external file
(such as CSV, Excel, or database) into a programming environment or software
(like Python, R, or Excel) for analysis or processing. It allows you to access and
manipulate the data for tasks like cleaning, visualization, or building models.

Data Pre-processing:-
Data preprocessing is a technique in machine learning and data science used to
prepare raw data for analysis or modeling. It involves cleaning, transforming, and
organizing data to ensure it's in the right format for algorithms to process.

Data Information:-

Data Describe:-

Data Shape:-

Data Visualization:-
Data visualization is the graphical representation of data using charts, graphs,
and maps. It helps to make complex data easier to understand by highlighting
patterns, trends, and insights.

Bar Chart:-
A bar chart is a data visualization tool that uses rectangular bars to represent
the values of different categories. Each bar's length is proportional to the
value it represents, making it easy to compare quantities across categories.
Bar charts are commonly used to display categorical data, such as sales
figures, survey results, or demographic information. They can be oriented
vertically or horizontally, with the x-axis typically representing the
categories and the y-axis showing the values. The clear visual distinction
between bars helps in quickly identifying trends, differences, and patterns in
the data.

Scatterplot:-
A scatterplot is a data visualization technique that displays values for two
variables as points on a Cartesian coordinate system. Each point represents an
observation, with one variable plotted along the x-axis and the other along the y-axis.
Scatterplots are commonly used to identify relationships or correlations
between the two variables, such as trends, clusters, or outliers. The distribution
of points can indicate positive, negative, or no correlation, helping to reveal
patterns and insights in the data. They are particularly useful in exploratory data
analysis and can also be enhanced with additional elements like colors or sizes
to represent other dimensions of data.

Machine Learning Models Implementation:-

Machine learning model implementation involves the process of applying
algorithms and statistical techniques to build models that can learn from data
and make predictions or decisions without being explicitly programmed. This typically
includes data preprocessing, where raw data is cleaned and transformed; model
selection, where the appropriate algorithm is chosen based on the problem;
training, where the model learns patterns from the data; validation, to ensure the
model's performance on unseen data; and finally, deployment, where the model
is integrated into an application or system for practical use.

Linear Regression:-

Chart:-

Decision Tree:-
Decision trees are a popular and versatile machine learning algorithm used for
classification and regression tasks. They model decisions and their possible
consequences, visualizing them as a tree-like structure of nodes and branches.

Chart:-

Machine Learning Algorithms Differences:-


Which Machine Learning Model Is Best for This Dataset?
