
6 Naive Bayes Classification Algorithm

The document contains a Python script for analyzing the Iris dataset using pandas and scikit-learn. It includes data loading, exploration, and preparation steps such as splitting the dataset into training and testing sets. The script also sets up a Gaussian Naive Bayes model for classification tasks.

Uploaded by

omkarmagdum818

cota12-6

March 25, 2025

[1]: '''name : Omkar Magdum
 Rollno:COTC53'''
[1]: 'name : Omkar Magdum\n Rollno:COTC53'
[2]:
import pandas as pd
import matplotlib.pyplot as plt
[3]: data = pd.read_csv("iris.csv")

[4]: data.head()

[4]:    Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
     0   1            5.1           3.5            1.4           0.2  Iris-setosa
     1   2            4.9           3.0            1.4           0.2  Iris-setosa
     2   3            4.7           3.2            1.3           0.2  Iris-setosa
     3   4            4.6           3.1            1.5           0.2  Iris-setosa
     4   5            5.0           3.6            1.4           0.2  Iris-setosa

[5]: data.shape

[5]: (150, 6)

[6]: data.head()

[6]:    Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
     0   1            5.1           3.5            1.4           0.2  Iris-setosa
     1   2            4.9           3.0            1.4           0.2  Iris-setosa
     2   3            4.7           3.2            1.3           0.2  Iris-setosa
     3   4            4.6           3.1            1.5           0.2  Iris-setosa
     4   5            5.0           3.6            1.4           0.2  Iris-setosa

[7]: data.tail()

[7]:       Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species
     145  146            6.7           3.0            5.2           2.3  Iris-virginica
     146  147            6.3           2.5            5.0           1.9  Iris-virginica
     147  148            6.5           3.0            5.2           2.0  Iris-virginica
     148  149            6.2           3.4            5.4           2.3  Iris-virginica
     149  150            5.9           3.0            5.1           1.8  Iris-virginica

[8]: data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Id             150 non-null    int64
 1   SepalLengthCm  150 non-null    float64
 2   SepalWidthCm   150 non-null    float64
 3   PetalLengthCm  150 non-null    float64
 4   PetalWidthCm   150 non-null    float64
 5   Species        150 non-null    object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB

[9]: data.describe()

[9]:                Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
     count  150.000000     150.000000    150.000000     150.000000    150.000000
     mean    75.500000       5.843333      3.054000       3.758667      1.198667
     std     43.445368       0.828066      0.433594       1.764420      0.763161
     min      1.000000       4.300000      2.000000       1.000000      0.100000
     25%     38.250000       5.100000      2.800000       1.600000      0.300000
     50%     75.500000       5.800000      3.000000       4.350000      1.300000
     75%    112.750000       6.400000      3.300000       5.100000      1.800000
     max    150.000000       7.900000      4.400000       6.900000      2.500000

[10]: data.isnull().sum()

[10]: Id               0
      SepalLengthCm    0
      SepalWidthCm     0
      PetalLengthCm    0
      PetalWidthCm     0
      Species          0
      dtype: int64

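[11]: # Features: every column except the label (note that Id is kept)
      x = data.drop(['Species'], axis=1)
      # Target: drop the four measurements, which leaves Id and Species
      y = data.drop(['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm',
                     'PetalWidthCm'], axis=1)
      print(x)
      print(y)
      print(x.shape)
      print(y.shape)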
      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
0      1            5.1           3.5            1.4           0.2
1      2            4.9           3.0            1.4           0.2
2      3            4.7           3.2            1.3           0.2
3      4            4.6           3.1            1.5           0.2
4      5            5.0           3.6            1.4           0.2
..   ...            ...           ...            ...           ...
145  146            6.7           3.0            5.2           2.3
146  147            6.3           2.5            5.0           1.9
147  148            6.5           3.0            5.2           2.0
148  149            6.2           3.4            5.4           2.3
149  150            5.9           3.0            5.1           1.8

[150 rows x 5 columns]
      Id         Species
0      1     Iris-setosa
1      2     Iris-setosa
2      3     Iris-setosa
3      4     Iris-setosa
4      5     Iris-setosa
..   ...             ...
145  146  Iris-virginica
146  147  Iris-virginica
147  148  Iris-virginica
148  149  Iris-virginica
149  150  Iris-virginica

[150 rows x 2 columns]
(150, 5)
(150, 2)
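
Both printed frames above still carry the `Id` column: `x` has five columns and `y` has two. Because the rows in this file are ordered by species, `Id` effectively encodes the label, and scikit-learn classifiers generally expect a one-dimensional target. A minimal sketch of an alternative selection follows, assuming the column names shown in the output above. The small `sample` frame stands in for `data`, and the names `feature_cols`, `X_alt`, and `y_alt` are illustrative only, not part of the original notebook.

```python
import pandas as pd

# Illustrative stand-in for `data`: four rows copied from the head()/tail() output.
sample = pd.DataFrame({
    "Id": [1, 2, 149, 150],
    "SepalLengthCm": [5.1, 4.9, 6.2, 5.9],
    "SepalWidthCm": [3.5, 3.0, 3.4, 3.0],
    "PetalLengthCm": [1.4, 1.4, 5.4, 5.1],
    "PetalWidthCm": [0.2, 0.2, 2.3, 1.8],
    "Species": ["Iris-setosa", "Iris-setosa", "Iris-virginica", "Iris-virginica"],
})

# Hypothetical helper list: the four measurement columns, without Id.
feature_cols = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]

X_alt = sample[feature_cols]   # measurements only
y_alt = sample["Species"]      # 1-D Series of class labels

print(X_alt.shape)  # (4, 4)
print(y_alt.shape)  # (4,)
```

On the full dataset the same selection would give shapes `(150, 4)` and `(150,)` instead of the `(150, 5)` and `(150, 2)` printed above.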
[12]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2,
                                                    shuffle=True)

print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

(120, 5)
(30, 5)
(120, 2)
(30, 2)
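
The shapes above follow from `test_size=0.2`: 30 of the 150 rows are held out and 120 remain for training. Since no `random_state` is passed, each run produces a different split, and the class balance among the 30 test rows can vary. A hedged sketch of a reproducible, stratified variant is shown here. It loads the Iris data bundled with scikit-learn instead of `iris.csv` so that it runs on its own, and the seed value 42 and the short variable names are arbitrary choices, not taken from the notebook.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Bundled copy of Iris (150 rows, 4 features); used here in place of iris.csv.
X, y = load_iris(return_X_y=True)

# random_state fixes the shuffle; stratify=y keeps the class proportions
# the same in the training and test parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42, stratify=y
)

print(X_tr.shape, X_te.shape)        # (120, 4) (30, 4)
print(np.bincount(y_te).tolist())    # [10, 10, 10]
```

With three balanced classes of 50 rows each, stratification places exactly 10 rows of every class in the test set.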
[14]: from sklearn.naive_bayes import GaussianNB

[15]: GaussianNB()

[15]: GaussianNB()
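
The notebook stops after constructing `GaussianNB()`, so the estimator is not yet fitted. A minimal sketch of how training and evaluation could continue is given here, under assumptions that are not in the original: it uses scikit-learn's bundled Iris data rather than `iris.csv`, a fixed seed of 0, and the illustrative names `model`, `y_pred`, and `acc`. Gaussian Naive Bayes treats each feature within a class as an independent normal distribution and predicts the class with the highest posterior probability.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Bundled Iris data (assumption: stands in for the notebook's iris.csv).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

model = GaussianNB()
model.fit(X_train, y_train)      # estimates per-class feature means and variances
y_pred = model.predict(X_test)   # one predicted class per test row

acc = accuracy_score(y_test, y_pred)
print(f"accuracy: {acc:.3f}")
print(confusion_matrix(y_test, y_pred))  # rows: true class, columns: predicted class
```

After fitting, `model.theta_` holds the per-class feature means, with one row per class and one column per feature.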
