CS3361 Data Science Lab Manual (II CYS)
EXNO:
DATE:
DOWNLOAD INSTALL AND EXPLORE THE FEATURES OF NUMPY, SCIPY,
JUPYTER, STATSMODELS AND PANDAS PACKAGES
AIM:
To download, install and explore the features of Numpy, Scipy, Jupyter, Statsmodels
and pandas packages.
ALGORITHM:
Step 1: Open the Command Prompt.
Step 2: Type pip install numpy.
Step 3: The NumPy package is installed.
Step 4: Type pip install scipy; the SciPy package is installed.
Step 5: Type pip install jupyter; the Jupyter packages are installed.
Step 6: Type pip install statsmodels; the package is installed.
Step 7: Type pip install pandas; the package is installed.
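Once the steps above have been run, a short script can confirm which packages Python can actually find. The following is a minimal sketch, not part of the original record; the helper name check_packages is hypothetical, and it assumes each package is importable under its lowercase pip name.

```python
import importlib.util

# Package names as used with pip in the steps above (assumed to match
# the importable module names).
PACKAGES = ["numpy", "scipy", "jupyter", "statsmodels", "pandas"]


def check_packages(names):
    """Return a dict mapping each name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(PACKAGES).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

Any package reported as missing can be installed by repeating the corresponding pip install step.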
Register no:411621104029
INSTALLATION PROCESS:
NumPy installation: pip install numpy
pandas installation: pip install pandas
RESULT:
Thus, the installation commands were executed successfully.
EXNO:
DATE:
AIM:
To write Python code that implements the concept of NumPy arrays.
ALGORITHM:
Step 1: Create a new Python file for the NumPy program.
Step 2: Import NumPy as np.
Step 3: Create an array.
Step 4: Assign it to a variable: variable_name = np.array([])
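The steps above can be sketched as follows. The array values and variable names are illustrative assumptions, since the original program listing is not reproduced in this record.

```python
import numpy as np

# One-dimensional array built from a Python list (illustrative values).
a = np.array([1, 2, 3, 4, 5])

# Two-dimensional array built from a list of lists.
b = np.array([[1, 2, 3], [4, 5, 6]])

print("a shape:", a.shape)           # number of elements along each axis
print("b shape:", b.shape)
print("a doubled:", a * 2)           # element-wise arithmetic
print("column sums of b:", b.sum(axis=0))
```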
RESULT:
Thus, the program using NumPy arrays was completed successfully.
EXNO:
DATE:
AIM:
To write a python code to implement the concept of Pandas Data frames.
ALGORITHM:
Step 1: import pandas.
Step 2: Create a data frame using List.
Step 3: Create Data frame from dict of ndarray /List.
Step 4: Delete the rows and columns.
A pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous
tabular data structure with labelled axes (rows and columns). Data is aligned in a
tabular fashion in rows and columns.
A pandas DataFrame consists of three principal components: the data, the rows and the columns.
PROGRAM & OUTPUT:
Creating a data frame using List:
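The algorithm steps (create a DataFrame from a list, create one from a dict of lists or ndarrays, then delete rows and columns) might look like the sketch below. The names, ages and column labels are hypothetical sample data, not taken from the original program.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, used only for illustration.
names = ["Asha", "Bala", "Chitra"]

# Create a DataFrame from a plain list (a single column).
df_from_list = pd.DataFrame(names, columns=["Name"])

# Create a DataFrame from a dict of lists / ndarrays (one key per column).
data = {"Name": names, "Age": np.array([20, 21, 19])}
df = pd.DataFrame(data)

# Delete a column and a row; drop() returns a new DataFrame by default.
df_without_age = df.drop(columns=["Age"])
df_without_first_row = df.drop(index=0)

print(df_from_list)
print(df)
print(df_without_age)
print(df_without_first_row)
```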
RESULT:
Thus, the program using pandas DataFrames was completed successfully.
EXNO:
DATE:
READING DATA FROM TEXT FILES, EXCEL AND THE WEB AND
EXPLORING VARIOUS COMMANDS DOING DESCRIPTIVE
ANALYTICS ON THE IRIS DATA SET
AIM:
To read data from text files, Excel and the web, and to explore various commands for
doing descriptive analytics on the Iris data set.
READING DATA FROM TEXT FILE:
ALGORITHM:
Step 1: Open Notepad and type a text.
Step 2: Save that text to Desktop or any other Folder.
Step 3: Open PyCharm and type the code.
Step 4: Run the program.
Step 5: The output is displayed.
PROGRAM:
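A minimal sketch of reading a text file is given here. The file name sample.txt and its contents are assumptions; the script writes the file to the temporary directory itself so that it runs anywhere, whereas in the lab the file would be the one saved from Notepad.

```python
import tempfile
from pathlib import Path


def read_text_file(path):
    """Open a text file and return its full contents as a string."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Stand-in for the file saved from Notepad (hypothetical name and text).
sample_path = Path(tempfile.gettempdir()) / "sample.txt"
sample_path.write_text("Hello, Data Science Lab!\n", encoding="utf-8")

content = read_text_file(sample_path)
print(content)
```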
OUTPUT:
Step 1: Download the Iris data set.
Link: https://www.kaggle.com/code/bharath25/descriptive-statistics-and-machine-learningiris/data
Step 2: Open PyCharm, install the required packages, and type the following commands.
Step 3: The output is displayed.
iris.describe()
iris.isnull().sum()
iris.value_counts("Species")
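To show what these commands return, the sketch below applies them to a tiny hand-made table. The rows are invented for illustration and are not the real Iris measurements; the column names SepalLengthCm and Species are assumed to match the downloaded CSV.

```python
import numpy as np
import pandas as pd

# Invented stand-in rows, NOT the real Iris data; one value is missing on purpose.
iris = pd.DataFrame({
    "SepalLengthCm": [5.0, 4.8, 6.1, np.nan, 6.5, 7.0],
    "Species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

summary = iris.describe()                        # count, mean, std, min, quartiles, max
missing = iris.isnull().sum()                    # missing values per column
species_counts = iris["Species"].value_counts()  # rows per species

print(summary)
print(missing)
print(species_counts)
```

With the real data set, the DataFrame would instead be loaded with pd.read_csv from the downloaded file.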
RESULT:
Thus, the program was executed successfully.
EXNO:
DATE:
USE THE DIABETES DATA SET FROM UCI AND PIMA INDIANS
DIABETES DATA SET
AIM:
To use the diabetes data set from UCI and the Pima Indians diabetes data set to perform the
following:
a) Implement Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard
Deviation, Skewness and Kurtosis from UCI dataset.
b) Bivariate analysis: Linear and Logistic Regression Modeling.
c) Multiple Regression Analysis.
ALGORITHM:
Step 1: Download the Pima Indians Diabetes dataset.
Link: https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database?resource=download
Step 2: Install the packages.
Step 3: Open PyCharm and type the following commands.
Step 4: The output is displayed.
PROGRAM:
5a) Univariate analysis:
Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and
Kurtosis.
print(df.shape)
df.info()
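The univariate measures listed above can each be computed with a single pandas call. This is a minimal sketch on a short invented series; the values and the column name Glucose are assumptions, not rows from the data set.

```python
import pandas as pd

# Invented values standing in for one numeric column of the data set.
glucose = pd.Series([90, 100, 100, 110, 120], name="Glucose")

stats = {
    "frequency": glucose.value_counts().to_dict(),  # how often each value occurs
    "mean": glucose.mean(),
    "median": glucose.median(),
    "mode": glucose.mode().tolist(),                # may contain several values
    "variance": glucose.var(),                      # sample variance (ddof=1)
    "std_dev": glucose.std(),                       # sample standard deviation
    "skewness": glucose.skew(),
    "kurtosis": glucose.kurt(),                     # excess kurtosis
}

for name, value in stats.items():
    print(f"{name}: {value}")
```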
LOGISTIC REGRESSION:
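The original listing for this part is not reproduced in the record. As a hedged illustration only, the sketch below fits a logistic regression with scikit-learn, which is an assumed extra package (pip install scikit-learn), on synthetic data; with the real data set the predictor and the binary outcome column would come from the CSV file.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data: one predictor and a binary outcome that tends to be 1
# when the predictor is large. Not taken from the diabetes data set.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
noise = 0.3 * rng.normal(size=200)
y = (x[:, 0] + noise > 0).astype(int)

model = LogisticRegression()
model.fit(x, y)

accuracy = model.score(x, y)  # fraction of training rows classified correctly
print("coefficient:", model.coef_[0, 0])
print("intercept:", model.intercept_[0])
print("training accuracy:", accuracy)
```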
RESULT:
Thus, the program was executed successfully.
EXNO:
DATE:
AIM:
To apply and explore various plotting functions on UCI data sets.
a) Normal Curves.
b) Density and Contour Plots.
c) Correlation and Scatter Plots.
d) Histograms.
e) Three Dimensional Plotting.
ALGORITHM:
Step 1: Download the Heart dataset from Kaggle.
Link: https://www.kaggle.com/datasets/zhaoyingzhu/heartcsv
Step 2: Save it in Downloads or any other folder and install the packages.
Step 3: Apply the following commands to the dataset.
Step 4: The output is displayed.
PROGRAM:
BOX PLOT:
a) Normal Curve:
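A normal curve can be drawn by fitting a mean and standard deviation to a column and plotting the Gaussian density over its histogram. The sketch below uses synthetic values as a stand-in for an age column; the helper normal_pdf, the output file name and the chosen parameters are assumptions. The Agg backend is selected only so that the script runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to show the window
import matplotlib.pyplot as plt


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


# Synthetic stand-in for a numeric column such as age (not real data).
rng = np.random.default_rng(0)
ages = rng.normal(loc=54, scale=9, size=300)

mu, sigma = ages.mean(), ages.std()
x = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 200)

fig, ax = plt.subplots()
ax.hist(ages, bins=20, density=True, alpha=0.5, label="data")
ax.plot(x, normal_pdf(x, mu, sigma), label="fitted normal curve")
ax.set_xlabel("Age")
ax.set_ylabel("Density")
ax.legend()
fig.savefig("normal_curve.png")
```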
b) Density Plots:
Correlation plot
Scatter plot
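One possible way to produce a correlation plot and a scatter plot is sketched below. The table is synthetic, and the column names Age, MaxHR and Chol are assumptions about the Heart data set; the output file name is likewise hypothetical.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to show the window
import matplotlib.pyplot as plt

# Synthetic stand-in table (not the real Heart data).
rng = np.random.default_rng(1)
age = rng.uniform(30, 75, size=100)
df = pd.DataFrame({
    "Age": age,
    "MaxHR": 220 - age + rng.normal(0, 8, size=100),
    "Chol": rng.normal(245, 50, size=100),
})

corr = df.corr()  # pairwise Pearson correlation matrix

fig, (ax_corr, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Correlation plot: the matrix shown as a colour-coded grid.
image = ax_corr.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
ax_corr.set_xticks(range(len(corr.columns)))
ax_corr.set_xticklabels(corr.columns)
ax_corr.set_yticks(range(len(corr.columns)))
ax_corr.set_yticklabels(corr.columns)
fig.colorbar(image, ax=ax_corr)

# Scatter plot of two of the columns.
ax_scatter.scatter(df["Age"], df["MaxHR"])
ax_scatter.set_xlabel("Age")
ax_scatter.set_ylabel("MaxHR")

fig.savefig("correlation_scatter.png")
```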
Histogram:
RESULT:
Thus, the program was executed successfully.
EXNO:
DATE:
AIM:
To gain insight into geographic data using Basemap.
ALGORITHM:
Step 1: Install Basemap. If it is downloaded as a zip file, extract it first.
Step 2: Import the packages.
Step 3: Save the file in Downloads or any other folder.
Step 4: Apply the following commands.
Step 5: The output is displayed.
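import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

# Orthographic view of the globe centred over North America
plt.figure(figsize=(8, 8))
m = Basemap(projection='ortho', resolution=None, lat_0=50, lon_0=-100)
m.bluemarble(scale=0.5)

# Lambert conformal conic map with a shaded-relief (etopo) background
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='lcc', resolution=None,
            width=8E6, height=8E6,
            lat_0=45, lon_0=-100)
m.etopo(scale=0.5, alpha=0.5)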
RESULT:
EXP NO:
DATE:
PROGRAM
import pandas as pd
ds1 = pd.Series([2, 4, 6, 8, 10])
ds2 = pd.Series([1, 3, 5, 7, 9])
print("Add two series")
ds = ds1 + ds2
print(ds)
print("Subtract two series")
ds = ds1 - ds2
print(ds)
print("Multiply two series")
ds = ds1 * ds2
print(ds)
print("Divide two series")
ds = ds1 / ds2
print(ds)
OUTPUT
RESULT
Thus, the program to perform arithmetic operations between two pandas Series has
been executed successfully.
EXPNO:
DATE:
SCATTER PLOTS IN PYTHON USING POKEMON DATASET
AIM
To create a scatter plot in Python using the Matplotlib and Seaborn libraries with
the Pokémon dataset.
ALGORITHM:
Step 1: Download the Pokémon dataset from Kaggle.
Link: https://www.kaggle.com/datasets/rounakbanik/pokemon
Step 2: Save it in Downloads or any other folder and install the packages.
Step 3: Apply the following commands to the dataset.
Step 4: The output is displayed.
PROGRAM:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
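Only the import lines of the original program survive in this record. As a hedged sketch of the remaining step, the code below draws a scatter plot with Matplotlib on an invented table; the rows, the placeholder names and the column names attack and defense are assumptions about the data set, and the output file name is hypothetical.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to show the window
import matplotlib.pyplot as plt

# Invented rows with placeholder names; not taken from the real data set.
pokemon = pd.DataFrame({
    "name": ["mon_a", "mon_b", "mon_c", "mon_d", "mon_e"],
    "attack": [49, 62, 82, 52, 64],
    "defense": [49, 63, 83, 43, 58],
})

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(pokemon["attack"], pokemon["defense"])
ax.set_xlabel("attack")
ax.set_ylabel("defense")
ax.set_title("Attack vs. defense")
fig.savefig("pokemon_scatter.png")
```

With Seaborn, a comparable plot can be drawn with sns.scatterplot(data=pokemon, x="attack", y="defense"); with the real file, the table would be loaded using pd.read_csv.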
OUTPUT:
RESULT: