CS3361 Data Science Lab Manual (II CYS)

The document discusses downloading and installing NumPy, SciPy, Jupyter, Statsmodels, and Pandas packages using pip in the command prompt. It then provides code snippets to create and work with NumPy arrays and Pandas dataframes, read data from text files and the web, perform descriptive analytics on the Iris dataset, use the Pima Indians diabetes dataset for univariate and bivariate analysis and multiple regression, apply various plotting functions to UCI datasets, and visualize geographic data with Basemap.

Uploaded by

rajananandh72138


Register no:411621104029

EXNO:
DATE:
DOWNLOAD, INSTALL AND EXPLORE THE FEATURES OF NUMPY, SCIPY,
JUPYTER, STATSMODELS AND PANDAS PACKAGES

AIM:
To download, install and explore the features of Numpy, Scipy, Jupyter, Statsmodels
and pandas packages.
ALGORITHM:
Step 1: Open the Command Prompt.
Step 2: Type pip install numpy.
Step 3: The NumPy package is installed.
Step 4: Type pip install scipy; the SciPy package is installed.
Step 5: Type pip install jupyter; the Jupyter packages are installed.
Step 6: Type pip install statsmodels; the Statsmodels package is installed.
Step 7: Type pip install pandas; the pandas package is installed.

INSTALLATION PROCESS:
Numpy Installation: pip install numpy

Scipy Installation: pip install scipy

Jupyter Installation: pip install jupyter

Statsmodels installation: pip install statsmodels

Pandas installation: pip install pandas
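
One way to confirm that the installations worked is to ask Python which versions it can see. The following is a minimal sketch, not part of the original record; it uses only the standard-library importlib.metadata module, and the helper name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# The five packages installed in this exercise
report = installed_versions(["numpy", "scipy", "jupyter", "statsmodels", "pandas"])
for name, ver in report.items():
    print(f"{name}: {ver if ver else 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED can be installed again with the matching pip command above.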

RESULT:
Thus, the packages were installed and the commands executed successfully.

EXNO:
DATE:

WORKING WITH NUMPY ARRAYS

AIM:
To write Python code that implements the concept of NumPy arrays.

ALGORITHM:
Step 1: Make sure NumPy is installed.
Step 2: Import numpy as np.
Step 3: Create an array.
Step 4: variable_name = np.array([...])

PROGRAM & OUTPUT:
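
The program itself did not survive in this copy. The following is a minimal sketch of the steps above, assuming only that NumPy is installed; the variable names and values are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])   # 1-D array built from a Python list
b = a.reshape(2, 3)                # same data arranged as 2 rows x 3 columns

print(a.ndim, a.shape, a.dtype)    # number of dimensions, shape, element type
print(b)
print(b.T.shape)                   # transpose swaps the axes: (3, 2)
print(a[1:4])                      # slicing: [2 3 4]
print(a * 2)                       # element-wise arithmetic
print(b.sum(axis=0))               # column sums: [5 7 9]
print(np.zeros((2, 2)))            # 2 x 2 array filled with zeros
print(np.arange(0, 10, 2))         # evenly spaced values: [0 2 4 6 8]
```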

RESULT:
Thus the working with Numpy array was completed successfully.

EXNO:
DATE:

WORKING WITH PANDAS DATA FRAMES

AIM:
To write a python code to implement the concept of Pandas Data frames.

ALGORITHM:
Step 1: import pandas.
Step 2: Create a data frame using List.
Step 3: Create Data frame from dict of ndarray /List.
Step 4: Delete the rows and columns.
A pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous
tabular data structure with labelled axes (rows and columns); that is, the data is
aligned in a tabular fashion in rows and columns. A DataFrame consists of three
principal components: the data, the rows and the columns.
PROGRAM & OUTPUT:
Creating a data frame using List:
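
The original listing is missing here. A minimal sketch of building a DataFrame from a list, with made-up sample values:

```python
import pandas as pd

# A flat list becomes a single-column DataFrame
names = ["Asha", "Bala", "Chitra"]          # made-up sample values
df = pd.DataFrame(names, columns=["Name"])
print(df)

# A list of lists becomes one row per inner list
rows = [["Asha", 20], ["Bala", 21], ["Chitra", 19]]
df2 = pd.DataFrame(rows, columns=["Name", "Age"])
print(df2)
```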

Creating Data frame from dict of ndarray/lists:
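
The original listing is missing here as well. A minimal sketch, assuming a dict whose values are a list and an ndarray of equal length; the index labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Each key becomes a column; every value must have the same length
data = {
    "Name": ["Asha", "Bala", "Chitra", "Dinesh"],   # plain list
    "Age": np.array([20, 21, 19, 18]),              # ndarray
}
df = pd.DataFrame(data, index=["rank1", "rank2", "rank3", "rank4"])
print(df)
```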

Dealing with Rows and Columns:
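
A minimal sketch of selecting, adding and deleting rows and columns (Step 4 of the algorithm), again with made-up sample values:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Bala", "Chitra"],
    "Age": [20, 21, 19],
    "City": ["Chennai", "Madurai", "Salem"],
})

print(df[["Name", "City"]])    # select columns by name
print(df.loc[1])               # select one row by index label
print(df.iloc[0:2])            # select rows by position

df["Adult"] = df["Age"] >= 18  # add a new column

# drop() returns a new DataFrame and leaves df unchanged
without_city = df.drop(columns=["City"])   # delete a column
without_first = df.drop(index=[0])         # delete a row
print(without_city)
print(without_first)
```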

RESULT:
Thus, the working with pandas Data Frame was completed successfully.

EXNO:
DATE:

READING DATA FROM TEXT FILES, EXCEL AND THE WEB AND
EXPLORING VARIOUS COMMANDS DOING DESCRIPTIVE
ANALYTICS ON THE IRIS DATA SET

AIM:
To read data from text files, Excel and the web, and to explore various commands for
doing descriptive analytics on the Iris data set.
READING DATA FROM TEXT FILE:
ALGORITHM:
Step 1: Open Notepad and type some text.
Step 2: Save the text file to the Desktop or any other folder.
Step 3: Open PyCharm and type the code.
Step 4: Run the program.
Step 5: The output is displayed.

PROGRAM:
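
The original program is not preserved in this copy. The sketch below shows one way to carry out the steps; so that it runs on its own, it first writes a small stand-in file to a temporary folder (the file name sample.txt and its contents are hypothetical) instead of relying on a file saved from Notepad.

```python
import tempfile
from pathlib import Path

# Stand-in for the Notepad file from Steps 1-2
folder = Path(tempfile.mkdtemp())
path = folder / "sample.txt"
path.write_text("Hello\nThis is a sample text file.\n", encoding="utf-8")

# Read the whole file at once
with open(path, "r", encoding="utf-8") as f:
    content = f.read()
print(content)

# Read the file line by line, dropping the trailing newline of each line
with open(path, "r", encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f]
print(lines)
```

For tabular data, pandas.read_csv accepts either a local path or a URL, and pandas.read_excel reads spreadsheet files (it needs an Excel engine such as openpyxl to be installed).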

OUTPUT:

IRIS DATA SET:

ALGORITHM:
Step 1: Download the Iris dataset from the Kaggle website and save it in Documents or any
other folder you want.
Link: https://www.kaggle.com/code/bharath25/descriptive-statistics-and-machine-learningiris/data
Step 2: Open PyCharm, install the required packages and type the following commands.
Step 3: The output is displayed.

iris.head(10)

iris.shape

iris.info()

iris.describe()

iris.isnull().sum()

iris.value_counts("Species")
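
The commands above can be tried end to end as sketched below. This is an illustration under assumptions: a handful of sample rows is loaded from an in-memory string in place of the downloaded CSV, and the column names are assumed to follow the Kaggle file. With the real file, replace io.StringIO(csv_text) with the path to the saved CSV.

```python
import io

import pandas as pd

# A few illustrative rows standing in for the downloaded CSV (column names assumed)
csv_text = """SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
"""
iris = pd.read_csv(io.StringIO(csv_text))

print(iris.head(10))                  # first rows (at most 10)
print(iris.shape)                     # (rows, columns)
iris.info()                           # column types and non-null counts
print(iris.describe())                # count, mean, std, min, quartiles, max
print(iris.isnull().sum())            # missing values per column
print(iris.value_counts("Species"))   # number of rows per species
```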

RESULT:
Thus, the program was executed successfully.

EXNO:
DATE:

USE THE DIABETES DATA SET FROM UCI AND PIMA INDIANS
DIABETES DATA SET

AIM:
To use the diabetes data set from UCI and the Pima Indians diabetes data set to perform the
following:
a) Implement Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard
Deviation, Skewness and Kurtosis from UCI dataset.
b) Bivariate analysis: Linear and Logistic Regression Modeling.
c) Multiple Regression Analysis.

ALGORITHM:
Step 1: Download the Pima Indians Diabetes dataset.
Link: https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database?resource=download
Step 2: Install the packages.
Step 3: Open PyCharm and type the following commands.
Step 4: The output is displayed.

PROGRAM:
5a) Univariate analysis:
Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and
Kurtosis.
print(df.shape)
print(df.info())

print(df.mean())

print(df.median())

print(df.mode())

print(df.std())

print(df.var())

print(df.skew())

print(df.kurtosis())

df.describe()
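
The commands above assume that df has already been loaded from the downloaded CSV. A minimal self-contained sketch follows; it uses a small inline table of illustrative values in place of the real file, and the column names are assumptions.

```python
import pandas as pd

# Small illustrative table standing in for the diabetes CSV (column names assumed)
df = pd.DataFrame({
    "Glucose": [148, 85, 183, 89, 137, 116, 78, 115],
    "BMI": [33.6, 26.6, 23.3, 28.1, 43.1, 25.6, 31.0, 35.3],
    "Age": [50, 31, 32, 21, 33, 30, 26, 29],
    "Outcome": [1, 0, 1, 0, 1, 0, 1, 0],
})

print(df["Outcome"].value_counts())   # frequency of each class
print(df.mean())                      # mean of every column
print(df.median())                    # median of every column
print(df.mode().iloc[0])              # first mode of every column
print(df.var())                       # sample variance (ddof=1)
print(df.std())                       # sample standard deviation
print(df.skew())                      # skewness
print(df.kurtosis())                  # excess kurtosis
```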

5 b) Bivariate Analysis: Linear and Logistic Regression Modelling.
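
The original listing for this part is missing. As an illustration of simple linear regression between two variables, the sketch below fits a straight line with numpy.polyfit. The data is synthetic (generated in the code, not taken from the diabetes file), and the original record may have used a different library.

```python
import numpy as np

# Synthetic data: y depends linearly on x, plus noise
rng = np.random.default_rng(0)
x = rng.uniform(20, 50, size=100)
y = 2.5 * x + 40 + rng.normal(0, 5, size=100)

# Least-squares straight line: y_hat = slope * x + intercept
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

# Goodness of fit
r = np.corrcoef(x, y)[0, 1]   # Pearson correlation coefficient
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}, R^2={r2:.3f}")
```

For a straight-line fit with an intercept, R^2 equals the square of the correlation coefficient, which is a quick check on the result.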

LOGISTIC REGRESSION:
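
The original listing is missing here. The sketch below shows the idea of logistic regression with one predictor, fitted by plain gradient descent in NumPy rather than with a statistics library. The data is synthetic, and the helper names sigmoid and fit_logistic are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(x, y, lr=0.1, n_iter=5000):
    """Fit P(y=1 | x) = sigmoid(w*x + b) by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    for _ in range(n_iter):
        p = sigmoid(w * x + b)
        w -= lr * np.mean((p - y) * x)   # gradient with respect to w
        b -= lr * np.mean(p - y)         # gradient with respect to b
    return w, b


# Synthetic data: higher x makes the outcome 1 more likely
rng = np.random.default_rng(1)
x_raw = rng.normal(120, 30, size=200)
x = (x_raw - x_raw.mean()) / x_raw.std()          # standardise the predictor
y = (rng.uniform(size=200) < sigmoid(2.0 * x - 0.5)).astype(float)

w, b = fit_logistic(x, y)
pred = (sigmoid(w * x + b) >= 0.5).astype(float)  # classify at threshold 0.5
accuracy = np.mean(pred == y)
print(f"w={w:.3f}, b={b:.3f}, accuracy={accuracy:.3f}")
```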

5 c) MULTIPLE REGRESSION ANALYSIS.

ALGORITHM:
Step 1: Import Libraries.
Step 2: Import dataset.
Step 3: Define x and y.
Step 4: Train the model on the training set.
Step 5: Predict the test set results.
Step 6: Evaluate the model.
Step 7: Plot the results.
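
Steps 1 to 6 can be sketched as follows. This is an illustration under assumptions: the data is synthetic, the model is fitted with numpy.linalg.lstsq rather than whichever library the original record used, and the plotting step is left out.

```python
import numpy as np

# Steps 1-3: synthetic dataset with three predictors (X) and one response (y)
rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=(n, 3))
true_coef = np.array([1.5, -2.0, 0.5])
y = 3.0 + X @ true_coef + rng.normal(0, 0.1, size=n)

# Split: first 80% for training, the rest for testing (rows are already random)
split = int(0.8 * n)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Step 4: train by least squares; the column of ones estimates the intercept
A_train = np.column_stack([np.ones(len(X_train)), X_train])
beta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Step 5: predict the test set
A_test = np.column_stack([np.ones(len(X_test)), X_test])
y_pred = A_test @ beta

# Step 6: evaluate with R^2 on the test set
ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print("intercept and coefficients:", np.round(beta, 3))
print(f"test R^2 = {r2:.4f}")
```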

RESULT:
Thus, the program was executed successfully.

EXNO:
DATE:

APPLY AND EXPLORE VARIOUS PLOTTING FUNCTIONS ON UCI DATA SETS

AIM:
To apply and explore various plotting functions on UCI data sets.
a) Normal Curves.
b) Density and Contour Plots.
c) Correlation and Scatter Plots.
d) Histograms.
e) Three Dimensional Plotting.

ALGORITHM:
Step 1: Download the Heart dataset from Kaggle.
Link: https://www.kaggle.com/datasets/zhaoyingzhu/heartcsv
Step 2: Save it in Downloads or any other folder and install the packages.
Step 3: Apply the following commands to the dataset.
Step 4: The output is displayed.

PROGRAM:

BOX PLOT:

a) Normal Curve:
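
The original listing is missing. A minimal sketch of overlaying a normal curve on a histogram follows; it uses synthetic values in place of a column from the Heart dataset.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic values standing in for one numeric column of the dataset
rng = np.random.default_rng(0)
values = rng.normal(loc=54, scale=9, size=300)

# Normal density with the same mean and standard deviation as the data
mu, sigma = values.mean(), values.std()
xs = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 200)
pdf = np.exp(-0.5 * ((xs - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

fig, ax = plt.subplots(figsize=(8, 5))
ax.hist(values, bins=20, density=True, alpha=0.4, label="data")
ax.plot(xs, pdf, label="normal curve")
ax.set_xlabel("value")
ax.set_ylabel("density")
ax.legend()
# In a script, call plt.show() to display the figure
```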

b) Density Plots:
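
The original listing is missing. A minimal sketch of a contour plot and a filled density plot follows, drawn for a standard bivariate normal density evaluated on a grid rather than for the dataset itself.

```python
import matplotlib.pyplot as plt
import numpy as np

# Evaluate a standard bivariate normal density on a 100 x 100 grid
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X ** 2 + Y ** 2) / 2) / (2 * np.pi)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 4.5))

# Contour lines at 8 automatically chosen levels
ax1.contour(X, Y, Z, levels=8)
ax1.set_title("Contour plot")

# Filled contours with a colour bar showing the density scale
filled = ax2.contourf(X, Y, Z, levels=20, cmap="viridis")
fig.colorbar(filled, ax=ax2, label="density")
ax2.set_title("Filled density plot")
# In a script, call plt.show() to display the figure
```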

c) Correlation and Scatter plots:

Correlation plot

Scatter plot
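
The original listings are missing. A minimal sketch of a correlation plot and a scatter plot follows; the data is synthetic and the column names Age, MaxHR and Chol are hypothetical stand-ins for columns of the Heart dataset.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data: MaxHR is built to decrease with Age, Chol is independent
rng = np.random.default_rng(0)
age = rng.normal(54, 9, 200)
df = pd.DataFrame({
    "Age": age,
    "MaxHR": 210 - 1.0 * age + rng.normal(0, 10, 200),
    "Chol": rng.normal(245, 50, 200),
})
corr = df.corr()   # pairwise Pearson correlations

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 4.5))

# Correlation plot: the matrix drawn as a colour-coded image
im = ax1.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
ax1.set_xticks(range(len(corr.columns)))
ax1.set_xticklabels(corr.columns)
ax1.set_yticks(range(len(corr.columns)))
ax1.set_yticklabels(corr.columns)
fig.colorbar(im, ax=ax1)
ax1.set_title("Correlation plot")

# Scatter plot of two of the columns
ax2.scatter(df["Age"], df["MaxHR"], s=15)
ax2.set_xlabel("Age")
ax2.set_ylabel("MaxHR")
ax2.set_title("Scatter plot")
# In a script, call plt.show() to display the figure
```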

d) Histogram:
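
The original listing is missing. A minimal histogram sketch follows, using synthetic values in place of a dataset column.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic values standing in for one numeric column of the dataset
rng = np.random.default_rng(0)
chol = rng.normal(245, 50, size=300)

fig, ax = plt.subplots(figsize=(8, 5))
# hist() returns the bin counts, the bin edges and the drawn bars
counts, edges, _ = ax.hist(chol, bins=15, edgecolor="black")
ax.set_xlabel("value")
ax.set_ylabel("frequency")
ax.set_title("Histogram")
# In a script, call plt.show() to display the figure
```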

e) Three Dimensional Plotting:
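
The original listing is missing. A minimal sketch of a three-dimensional scatter plot with Matplotlib follows; the three variables are synthetic and their names are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np

# Three synthetic variables standing in for dataset columns
rng = np.random.default_rng(0)
age = rng.normal(54, 9, 150)
chol = rng.normal(245, 50, 150)
max_hr = 210 - age + rng.normal(0, 10, 150)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(projection="3d")   # request a 3-D axes
ax.scatter(age, chol, max_hr, c=max_hr, cmap="viridis")
ax.set_xlabel("Age")
ax.set_ylabel("Chol")
ax.set_zlabel("MaxHR")
# In a script, call plt.show() to display the figure
```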

RESULT:
Thus, the program was executed successfully.

EXNO:
DATE:

VISUALIZING GEOGRAPHIC DATA WITH BASEMAP

AIM:
To gain insight into geographic data by visualizing it with Basemap.

ALGORITHM:
Step 1: Install Basemap. If it is downloaded as a zip file, extract it first.
Step 2: Import the packages.
Step 3: Save the program in Downloads or any other folder.
Step 4: Apply the following commands.
Step 5: The output is displayed.

PROGRAM & OUTPUT:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

plt.figure(figsize=(8, 8))
m = Basemap(projection='ortho', resolution=None, lat_0=50, lon_0=-100)
m.bluemarble(scale=0.5);

fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='lcc', resolution=None,
            width=8E6, height=8E6,
            lat_0=45, lon_0=-100,)
m.etopo(scale=0.5, alpha=0.5)

# Map (long, lat) to (x, y) for plotting
x, y = m(-122.3, 47.6)
plt.plot(x, y, 'ok', markersize=5)
plt.text(x, y, ' Seattle', fontsize=12);

RESULT:

Thus, the program was executed successfully.

EXP NO:
DATE:

ARITHMETIC OPERATIONS BETWEEN TWO PANDAS SERIES

AIM
To write a Python program to perform arithmetic operations between two pandas Series.
ALGORITHM
STEP 1: Start
STEP 2: Import pandas package
STEP 3: Initialise ds1 and ds2
STEP 4: For addition, calculate ds1+ds2
STEP 5: For subtraction, calculate ds1-ds2
STEP 6: For multiplication, calculate ds1*ds2
STEP 7: For division, calculate ds1/ds2
STEP 8: Print the desired results
STEP 9: Stop

PROGRAM
import pandas as pd
ds1=pd.Series([2,4,6,8,10])
ds2=pd.Series([1,3,5,7,9])
print("Add two series")
ds=ds1+ds2
print(ds)
print("Subtract two series")
ds=ds1-ds2
print(ds)
print("Multiply two series")
ds=ds1*ds2
print(ds)
print("Divide two series")
ds=ds1/ds2
print(ds)

OUTPUT

RESULT
Thus, the program to perform arithmetic operations between two pandas Series has
been executed successfully.

EXPNO:
DATE:
SCATTER PLOTS IN PYTHON USING POKEMON DATASET
AIM
To create scatter plots in Python using the Matplotlib and Seaborn libraries with the
Pokémon dataset.
ALGORITHM:
Step 1: Download the Pokémon dataset from Kaggle.
Link: https://www.kaggle.com/datasets/rounakbanik/pokemon
Step 2: Save it in Downloads or any other folder and install the packages.
Step 3: Apply the following commands to the dataset.
Step 4: The output is displayed.
PROGRAM:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv("../input/pokemon.csv")
data.shape
data.head()

g1 = data.loc[data.generation==1,:]

# dataframe.plot.scatter() method
g1.plot.scatter('attack', 'defense'); # The ';' is to avoid showing a message before showing the plot

# plt.scatter() function
plt.scatter('attack', 'defense', data=g1);

g1.plot.scatter('attack', 'defense', s = 40, c = 'orange', marker = 's', figsize=(8,5.5));

plt.figure(figsize=(10,7)) # Specify size of the chart
plt.scatter('attack', 'defense', data=data[data.is_legendary==1], marker = 'x', c = 'magenta')
plt.scatter('attack', 'defense', data=data[data.is_legendary==0], marker = 'o', c = 'blue')
plt.legend(('Yes', 'No'), title='Is legendary?')
plt.show()

plt.figure(figsize=(10,7))
sns.scatterplot(x = 'attack', y = 'defense', s = 70, hue ='is_legendary', data=data); # hue represents color

plt.figure(figsize=(10,7))
sns.scatterplot(x = 'attack', y = 'defense', s = 50, hue = 'is_legendary', style ='is_legendary', data=data); # style represents marker

plt.figure(figsize=(11,7))
sns.scatterplot(x = 'attack', y = 'defense', s = 50, hue = 'type1', data=data)
plt.legend(bbox_to_anchor=(1.02, 1)) # move legend to outside of the chart
plt.title('Defense vs Attack for All Pokemons', fontsize=16)
plt.xlabel('Attack', fontsize=12)
plt.ylabel('Defense', fontsize=12)
plt.show()

water = data[data.type1 == 'water']
water.plot.scatter('height_m', 'weight_kg', figsize=(10,6))
plt.grid(True) # add gridlines
plt.show()

water.plot.scatter('height_m', 'weight_kg', figsize=(10,6))
plt.grid(True)
for index, row in water.nlargest(5, 'height_m').iterrows():
    plt.annotate(row['name'], # text to show
                 xy = (row['height_m'], row['weight_kg']), # the point to annotate
                 xytext = (row['height_m']+0.2, row['weight_kg']), # where to show the text
                 fontsize=12)
plt.xlim(0, ) # x-axis has minimum 0
plt.ylim(0, ) # y-axis has minimum 0
plt.show()

OUTPUT:

RESULT:

Thus, the above program was executed successfully.
