Yogesh Siddiq Edited
PROGRAM 1:
# Initialize KMeans with 3 clusters (since the iris dataset has 3 classes)
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
# Visualize the clusters (for sepal length and sepal width features)
plt.figure(figsize=(8, 6))
plt.scatter(X[:, 0], X[:, 1], c=predicted_clusters, s=50, cmap='viridis')
OUTPUT:
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  \
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2

  species
0  setosa
1  setosa
2  setosa
3  setosa
4  setosa
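The Program 1 listing shows only the KMeans setup and the plotting calls, so X and predicted_clusters are never defined in it. The sketch below is one way the missing steps could look, not necessarily how the original program was written. It assumes the data comes from scikit-learn's load_iris and that the "species" column seen in the output was built from the target names. The cross-tabulation at the end is an addition for illustration only.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Load the iris data: X holds the four measurements, y the true class ids.
iris = load_iris()
X, y = iris.data, iris.target

# Build a DataFrame like the one printed under OUTPUT. The "species" column
# name comes from that output; how the original built it is an assumption.
df = pd.DataFrame(X, columns=iris.feature_names)
df["species"] = iris.target_names[y]
print(df.head())

# Same estimator settings as in the listing.
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
predicted_clusters = kmeans.fit_predict(X)

# Cluster ids are arbitrary labels, so compare them with the species through
# a cross-tabulation rather than by direct equality.
print(pd.crosstab(df["species"], predicted_clusters, colnames=["cluster"]))
```

With X and predicted_clusters defined this way, the plt.scatter call from the listing can be run unchanged.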
PROGRAM 2:
K-MEANS CLUSTERING USING BREAST CANCER DATASET
# Initialize KMeans with 2 clusters (since breast cancer dataset has 2 classes: malignant and benign)
kmeans = KMeans(n_clusters=2, random_state=42)
# Visualize the clusters using the first two scaled features (mean radius and mean texture).
# Note: these are original feature axes, not principal components.
plt.figure(figsize=(8, 6))
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=predicted_clusters, s=50, cmap='viridis')
OUTPUT:
   mean radius  mean texture  mean perimeter  mean area  mean smoothness  \
0        17.99         10.38          122.80     1001.0          0.11840
1        20.57         17.77          132.90     1326.0          0.08474
2        19.69         21.25          130.00     1203.0          0.10960
3        11.42         20.38           77.58      386.1          0.14250
4        20.29         14.34          135.10     1297.0          0.10030

[5 rows x 30 columns]
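The Program 2 scatter plot uses columns 0 and 1 of X_scaled, which are standardized mean radius and mean texture. If the intent was to view the clusters along principal components, the data has to be projected with PCA first. The sketch below shows one way to do that. It assumes X_scaled was produced with StandardScaler (the listing does not show this step) and sets n_init=10 explicitly, as Program 1 does. The names pca and X_pca are introduced here for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load the 30 numeric features of the breast cancer dataset.
cancer = load_breast_cancer()
X = cancer.data

# Standardize each feature to zero mean and unit variance
# (assumed to be how X_scaled was obtained in the original program).
X_scaled = StandardScaler().fit_transform(X)

# Cluster in the full 30-dimensional scaled space.
kmeans = KMeans(n_clusters=2, random_state=42, n_init=10)
predicted_clusters = kmeans.fit_predict(X_scaled)

# Project onto the first two principal components for plotting only.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print("Shape of the projected data:", X_pca.shape)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

Passing X_pca[:, 0] and X_pca[:, 1] to the plt.scatter call, in place of the two X_scaled columns, plots the same cluster labels on the principal component axes.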
PROGRAM 3:
plt.xlabel(diabetes_data.feature_names[0])
plt.ylabel(diabetes_data.feature_names[1])
plt.title('K-means Clustering on Diabetes Dataset')
plt.legend()
plt.show()
OUTPUT:
Shape of the original dataset (X): (442, 10)
Shape of the scaled dataset (X_scaled): (442, 10)
        age       sex       bmi        bp        s1        s2        s3  \
0  0.038076  0.050680  0.061696  0.021872 -0.044223 -0.034821 -0.043401
1 -0.001882 -0.044642 -0.051474 -0.026328 -0.008449 -0.019163  0.074412
2  0.085299  0.050680  0.044451 -0.005670 -0.045599 -0.034194 -0.032356
3 -0.089063 -0.044642 -0.011595 -0.036656  0.012191  0.024991 -0.036038
4  0.005383 -0.044642 -0.036385  0.021872  0.003935  0.015596  0.008142

         s4        s5        s6
0 -0.002592  0.019907 -0.017646
1 -0.039493 -0.068332 -0.092204
2 -0.002592  0.002861 -0.025930
3  0.034309  0.022688 -0.009362
4 -0.002592 -0.031988 -0.046641
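The Program 3 listing shows only the axis labels, title and legend calls. The sketch below is one way the earlier steps could produce the shapes and the table printed above. It assumes the data comes from scikit-learn's load_diabetes and that X_scaled comes from StandardScaler. The number of clusters is not given anywhere in the listing, so n_clusters=3 is a placeholder chosen for illustration. The diabetes target is a continuous measurement, so it does not suggest a class count the way the iris and breast cancer targets do.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import StandardScaler

# load_diabetes returns features that are already mean-centered and scaled
# to unit sum of squares per column, which is why the values are so small.
diabetes_data = load_diabetes()
X = diabetes_data.data

# StandardScaler rescales each column to unit variance (assumed step).
X_scaled = StandardScaler().fit_transform(X)

print("Shape of the original dataset (X):", X.shape)
print("Shape of the scaled dataset (X_scaled):", X_scaled.shape)
print(pd.DataFrame(X, columns=diabetes_data.feature_names).head())

# n_clusters=3 is a placeholder; the original value is not shown.
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
predicted_clusters = kmeans.fit_predict(X_scaled)

# Cluster centers in the scaled feature space, one row per cluster.
centroids = kmeans.cluster_centers_
print(centroids[:, :2])
```

plt.legend() only lists artists that were drawn with a label argument, so the scatter calls preceding it in the original presumably passed one, for example label='Centroids' on a scatter of centroids[:, 0] against centroids[:, 1].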