Lab Exercise 2

Uploaded by

srujalprusty555
Copyright
© All Rights Reserved

LAB EXERCISE FOR ML LAB (18-09-24)

1. Write a Python program to load the iris data from a given csv file into a dataframe and
print the shape of the data, type of the data and first 3 rows.
import pandas as pd
data = pd.read_csv("iris.csv")
print("Shape of the data:")
print(data.shape)
print("\nData Type:")
print(type(data))
print("\nFirst 3 rows:")
print(data.head(3))
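2. Write a Python program using Scikit-learn to print the keys, number of rows and
columns, feature names and the description of the Iris data.
from sklearn.datasets import load_iris
# load_iris() returns a dictionary-like Bunch object bundled with Scikit-learn
iris_data = load_iris()
print("\nKeys of Iris dataset:")
print(iris_data.keys())
print("\nNumber of rows and columns of Iris dataset:")
print(iris_data.data.shape)
print("\nFeature names of Iris dataset:")
print(iris_data.feature_names)
print("\nDescription of Iris dataset:")
print(iris_data.DESCR)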
3. Write a Python program to get the number of observations, missing values and nan
values.
import pandas as pd
iris = pd.read_csv("iris.csv")
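# info() prints the row count and non-null counts itself and returns None
iris.info()
print("\nMissing (NaN) values per column:")
print(iris.isnull().sum())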
4. Write a Python program to view basic statistical details like percentile, mean, std etc.
of iris data.
import pandas as pd
data = pd.read_csv("iris.csv")
print(data.describe())
5. Write a Python program to drop the Id column from a given DataFrame and print the
modified DataFrame. Use iris.csv to create the DataFrame.
import pandas as pd
data = pd.read_csv("iris.csv")
print("Original Data:")
print(data.head())
new_data = data.drop('Id',axis=1)
print("After removing id column:")
print(new_data.head())
6. Write a Python program to access the first four cells of a given DataFrame using the
index and column labels. Use iris.csv to create the DataFrame.
import pandas as pd
data = pd.read_csv("iris.csv")
print("Original Data:")
print(data.head())
new_data = data.drop('Id',axis=1)
print("After removing id column:")
print(new_data.head())
x = data.iloc[:, [1, 2, 3, 4]].values
print(x)
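The program above selects the four measurement columns by position with iloc. A label-based reading of the task could look like the sketch below. It assumes "first four cells" means the first four rows of one column, and it uses a small inline frame with the worksheet's column names as a stand-in for iris.csv so that it runs without the file.

```python
import pandas as pd

# Stand-in for pd.read_csv("iris.csv"): a few rows with the same column names.
data = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "SepalLengthCm": [5.1, 4.9, 4.7, 4.6, 5.0],
    "SepalWidthCm": [3.5, 3.0, 3.2, 3.1, 3.6],
})

# .loc selects by labels; a label slice includes its end point,
# so 0:3 covers the four row labels 0, 1, 2 and 3.
first_four = data.loc[0:3, "SepalLengthCm"]
print(first_four)

# .at reads a single cell by row label and column label.
first_cell = data.at[0, "SepalLengthCm"]
print(first_cell)
```

With the real file, replacing the inline frame with pd.read_csv("iris.csv") should leave the rest unchanged.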
7. Write a Python program to create a plot to get a general Statistics of Iris data.
import pandas as pd
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
iris.describe().plot(kind="area", fontsize=16, figsize=(15, 8), table=True, colormap="Accent")
plt.xlabel('Statistics')
plt.ylabel('Value')
plt.title("General Statistics of Iris Dataset")
plt.show()
8. Write a Python program to create a Bar plot to get the frequency of the three species
of the Iris data.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
# plt.subplots returns a (figure, axes) pair; draw the count plot on that axes
fig, ax = plt.subplots(1, 1, figsize=(10, 8))
sns.countplot(x='Species', data=iris, ax=ax)
plt.title("Iris Species Count")
plt.show()
9. Write a Python program to create a Pie plot to get the frequency of the three species
of the Iris data.
import pandas as pd
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
# '%%' in autopct prints a literal percent sign after each value
iris['Species'].value_counts().plot.pie(explode=[0.1, 0.1, 0.1], autopct='%1.1f%%',
                                        shadow=True, figsize=(10, 8))
plt.title("Iris Species %")
plt.show()
10. Write a Python program to create a graph to find relationship between the sepal length
and width.
import pandas as pd
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
# Plot each species on the same axes; labels must match the CSV exactly (e.g. 'Iris-setosa')
fig = iris[iris.Species == 'Iris-setosa'].plot(
    kind='scatter', x='SepalLengthCm', y='SepalWidthCm', color='orange', label='Setosa')
iris[iris.Species == 'Iris-versicolor'].plot(
    kind='scatter', x='SepalLengthCm', y='SepalWidthCm', color='blue', label='versicolor', ax=fig)
iris[iris.Species == 'Iris-virginica'].plot(
    kind='scatter', x='SepalLengthCm', y='SepalWidthCm', color='green', label='virginica', ax=fig)
fig.set_xlabel("Sepal Length")
fig.set_ylabel("Sepal Width")
fig.set_title("Sepal Length VS Width")
fig=plt.gcf()
fig.set_size_inches(12,8)
plt.show()

11. Write a Python program to create a graph to find relationship between the petal length
and width.
import pandas as pd
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
# Plot each species on the same axes, reusing the axes returned by the first call
fig = iris[iris.Species == 'Iris-setosa'].plot.scatter(
    x='PetalLengthCm', y='PetalWidthCm', color='orange', label='Setosa')
iris[iris.Species == 'Iris-versicolor'].plot.scatter(
    x='PetalLengthCm', y='PetalWidthCm', color='blue', label='versicolor', ax=fig)
iris[iris.Species == 'Iris-virginica'].plot.scatter(
    x='PetalLengthCm', y='PetalWidthCm', color='green', label='virginica', ax=fig)
fig.set_xlabel("Petal Length")
fig.set_ylabel("Petal Width")
fig.set_title("Petal Length VS Width")
fig=plt.gcf()
fig.set_size_inches(12,8)
plt.show()
12. Write a Python program to create a graph to see how SepalLength, SepalWidth,
PetalLength and PetalWidth are distributed.
import pandas as pd
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
# Drop id column
new_data = iris.drop('Id',axis=1)
new_data.hist(edgecolor='black', linewidth=1.2)
fig=plt.gcf()
fig.set_size_inches(12,12)
plt.show()
13. Write a Python program to create a jointplot to describe individual distributions on the
same plot between Sepal length and Sepal width.
Note: jointplot - Draws a plot of two variables with bivariate and univariate graphs.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
fig=sns.jointplot(x='SepalLengthCm', y='SepalWidthCm',
data=iris, color='blue')
plt.show()

14. Write a Python program to create a jointplot using "hexbin" to describe individual
distributions on the same plot between Sepal length and Sepal width.
Note:
The bivariate analogue of a histogram is known as a "hexbin" plot, because it shows
the counts of observations that fall within hexagonal bins. This plot works best with
relatively large datasets. It's available through the matplotlib plt.hexbin function and
as a style in jointplot(). It looks best with a white background.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
fig=sns.jointplot(x='SepalLengthCm', y='SepalWidthCm', kind="hex", color="red",
data=iris)
plt.show()
15. Write a Python program to create a jointplot using "kde" to describe individual
distributions on the same plot between Sepal length and Sepal width.
Note:
The kernel density estimation (kde) procedure visualizes a bivariate distribution. In
seaborn, this kind of plot is shown with a contour plot and is available as a style in
jointplot().
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
iris = pd.read_csv("iris.csv")
fig=sns.jointplot(x='SepalLengthCm', y='SepalWidthCm', kind="kde", color='cyan',
data=iris)
plt.show()
16. Write a Python program to create a jointplot and add regression and kernel density fits
using "reg" to describe individual distributions on the same plot between Sepal length
and Sepal width.
17. Write a Python program to draw a scatterplot, then add a joint density estimate to
describe individual distributions on the same plot between Sepal length and Sepal
width.
18. Write a Python program to create a jointplot using "kde" to describe individual
distributions on the same plot between Sepal length and Sepal width and use '+' sign
as marker.
Note:
The kernel density estimation (kde) procedure visualizes a bivariate distribution. In
seaborn, this kind of plot is shown with a contour plot and is available as a style in
jointplot().
19. Write a Python program to create a pairplot of the iris data set and check which flower
species seems to be the most separable.
20. Write a Python program to create a box plot (or box-and-whisker plot) which shows
the distribution of quantitative data in a way that facilitates comparisons between
variables or across levels of a categorical variable of iris dataset. Use seaborn.
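For the box plot, a minimal sketch using seaborn's boxplot with Species as the categorical variable. Plotting PetalLengthCm is an arbitrary choice; any of the four measurements could be used. The inline frame is a stand-in for iris.csv so that the code runs without the file.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Stand-in for pd.read_csv("iris.csv"): three rows per species, same column names.
iris = pd.DataFrame({
    "PetalLengthCm": [1.4, 1.4, 1.3, 4.7, 4.5, 4.9, 6.0, 5.1, 5.9],
    "Species": ["Iris-setosa"] * 3 + ["Iris-versicolor"] * 3 + ["Iris-virginica"] * 3,
})

# One box per species: median, quartiles and whiskers of petal length.
ax = sns.boxplot(x="Species", y="PetalLengthCm", data=iris)
ax.set_title("Petal Length by Species")
plt.show()
```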
