ML MANUAL
COLLEGE OF ENGINEERING
AND TECHNOLOGY
(AUTONOMOUS INSTITUTION)
Coimbatore - 641032
REG.NO :
NAME :
COURSE :
YEAR/SEM:
HINDUSTHAN
COLLEGE OF ENGINEERING AND TECHNOLOGY
(AUTONOMOUS INSTITUTION)
Coimbatore-641032.
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Place : Coimbatore
Date:
Register Number:
Submitted for the 22CS5252 / MACHINE LEARNING LABORATORY practical examination
conducted on .
INTERNAL EXAMINER                    EXTERNAL EXAMINER
CONTENTS
S.NO   DATE   EXPERIMENT                                                 PAGE NO   MARKS   SIGN
1 a)          Implementation of Basic Python Libraries
              (Math, Numpy, Scipy)
1 b)          Implementation of Python Libraries for Machine Learning
              Applications (Pandas, Matplotlib)
1 c)          Creation and Loading of Datasets
STAFF IN-CHARGE
Ex.No: 01 a)
Implementation of Basic Python Libraries (Math, Numpy, Scipy)
Date:
Aim:
To explore and demonstrate basic mathematical operations and functionalities using Python's Math,
Numpy, and Scipy libraries.
Algorithm:
1. Use Math library for basic mathematical operations like square roots, trigonometry, and factorials.
2. Use Numpy for array creation, arithmetic operations on arrays, and statistical functions.
3. Use Scipy for advanced mathematical functions, such as integration and solving linear algebra
problems.
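Step 3 mentions solving linear algebra problems. A minimal sketch of that step is given below, assuming a small 2x2 system chosen purely for illustration (the coefficients A and b are not taken from the original program):

```python
import numpy as np
from scipy import linalg

# Hypothetical system, for illustration only:
#   3x + 2y = 12
#    x -  y = -1
A = np.array([[3.0, 2.0], [1.0, -1.0]])
b = np.array([12.0, -1.0])

# Solve A @ solution = b
solution = linalg.solve(A, b)
print("Solution [x, y]:", solution)

# Check the solution by substituting it back into the system
print("Substitution check passed:", np.allclose(A @ solution, b))
```

Substituting the solution back into the system is a quick way to confirm the result without comparing floating-point values by eye.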
Program:
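# Importing libraries
import math
import numpy as np
from scipy import integrate
from scipy import linalg
# Math library operations (values taken from the Output section)
print("Math Library Operations:")
print("Square root of 16:", math.sqrt(16))
print("Factorial of 5:", math.factorial(5))
print("Sin(45 degrees):", math.sin(math.radians(45)))
# Array operations (the array values are assumed; the listing does not show them)
array = np.array([1, 2, 3, 4, 5])
print("Array after adding 10:", array + 10)
print("Mean of array:", np.mean(array))
print("Standard deviation of array:", np.std(array))
print("Dot product of array with itself:", np.dot(array, array))
# Matrix operations (matrix taken from the Output section)
matrix = np.array([[1, 2], [3, 4]])
print("Original matrix:\n", matrix)
print("Matrix transpose:\n", matrix.T)
print("Matrix determinant:", linalg.det(matrix))
# Integration using Scipy
result, error = integrate.quad(lambda x: x**2, 0, 1)
print("Integration of x^2 from 0 to 1:", result)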
Output:
Math Library Operations:
Square root of 16: 4.0
Factorial of 5: 120
Sin(45 degrees): 0.7071067811865476
Original matrix:
[[1 2]
[3 4]]
Matrix transpose:
[[1 3]
[2 4]]
Matrix determinant: -2.0000000000000004
Result:
Thus the given program was executed successfully and the output was verified.
Ex.No: 01 b)
Implementation of Python Libraries for Machine Learning Applications (Pandas, Matplotlib)
Date:
Aim:
To demonstrate essential data manipulation using Pandas and visualize data using Matplotlib.
Algorithm:
1. Use Pandas to load, inspect, filter, and analyze a dataset.
2. Use Matplotlib to plot and visualize the data trends.
Program:
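# Importing libraries
import pandas as pd
import matplotlib.pyplot as plt
# Creating a DataFrame
# Age and Salary values are those implied by the summary statistics in the Output section;
# the names and the pairing of rows are placeholders, since the listing does not show them.
data = {
    'Name': ['A', 'B', 'C', 'D', 'E'],
    'Age': [22, 24, 27, 29, 32],
    'Salary': [45000, 50000, 54000, 58000, 62000]
}
df = pd.DataFrame(data)
print("Pandas Library Operations:")
print("Dataframe:\n", df)
# Data inspection
print("\nBasic Data Information:")
print("Data types:\n", df.dtypes)
print("Summary statistics:\n", df.describe())
# Filtering data
filtered_df = df[df['Age'] > 25]
print("\nFiltered data (Age > 25):\n", filtered_df)
# Plotting Age vs Salary as a scatter plot
plt.figure(figsize=(8, 5))
plt.scatter(df['Age'], df['Salary'], color='green')
plt.xlabel('Age')
plt.ylabel('Salary')
plt.title('Salary vs Age')
plt.show()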
Output:
Pandas Library Operations:
Dataframe:
Data types:
Name      object
Age        int64
Salary     int64
dtype: object
Summary statistics:
Age Salary
count 5.000000 5.000000
mean 26.800000 53800.000000
std 3.962323 6648.308055
min 22.000000 45000.000000
25% 24.000000 50000.000000
50% 27.000000 54000.000000
75% 29.000000 58000.000000
max 32.000000 62000.000000
Matplotlib Library Operations:
Result:
Thus the given program was executed successfully and the output was verified.
Ex.No:01 c)
Creation and Loading of Datasets
Date:
Aim:
To understand different ways to create and load datasets in Python, including generating synthetic data,
loading from CSV files, and using pre-existing datasets from libraries.
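The first two approaches named in the aim (synthetic data and CSV loading) can be sketched as follows. The sample sizes, the random seeds, and the file name sample.csv are assumptions made for illustration; the column names A, B, C mirror the table in the Output section.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from sklearn.datasets import make_classification

# 1. Synthetic classification data (sizes and seed are illustrative)
X, y = make_classification(n_samples=20, n_features=3, n_informative=2,
                           n_redundant=0, random_state=42)
print("Features:\n", X[:5])
print("Labels:", y[:5])

# 2. A DataFrame of random values in [0, 1)
rng = np.random.default_rng(0)
random_df = pd.DataFrame(rng.random((5, 3)), columns=["A", "B", "C"])

# 3. Write the DataFrame to a CSV file and load it back
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "sample.csv")  # hypothetical file name
    random_df.to_csv(csv_path, index=False)
    loaded_df = pd.read_csv(csv_path)
print(loaded_df.head())
```

Writing to a temporary directory keeps the sketch self-contained; in the lab, the CSV path would point to an existing data file instead.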
Algorithm:
Program:
# Importing libraries
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification, load_iris
import seaborn as sns
# 4. Loading a Built-in Dataset Using Scikit-Learn
print("\nLoading Built-in Dataset with Scikit-Learn:")
# Load the Iris dataset
iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
print(iris_df.head())
Output:
Features:
Labels:
[1 1 0 0 0]
Sno A B C
1 0.264556 0.617635 0.359508
2 0.774234 0.612096 0.437032
3 0.456150 0.616934 0.697631
4 0.568434 0.943748 0.060225
5 0.018790 0.681820 0.666767
Loading Built-in Dataset with Scikit-Learn:
Sno sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)
1 5.1 3.5 1.4 0.2
2 4.9 3.0 1.4 0.2
3 4.7 3.2 1.3 0.2
4 4.6 3.1 1.5 0.2
5 5.0 3.6 1.4 0.2
Result:
Thus the given program was executed successfully and the output was verified.
Ex.No:02
Find-S Algorithm for Hypothesis Selection
Date:
Aim:
To implement and demonstrate the Find-S algorithm for finding the most specific hypothesis from a
given set of training data samples read from a CSV file.
Algorithm:
1. Initialize the hypothesis as the most specific hypothesis, where all attributes are set to "null" (ϕ).
2. For each positive example:
   o If this is the first positive example, set the hypothesis to match this example.
   o For each subsequent positive example, update the hypothesis by generalizing (replacing with "?")
     every attribute that differs from the current hypothesis.
3. Return the final hypothesis after processing all positive examples.
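The steps above can be traced on a small in-memory dataset. This is a minimal sketch: the function name find_s and the training rows are hypothetical and are not taken from the original CSV file.

```python
def find_s(examples):
    """Return the most specific hypothesis consistent with the positive examples.

    Each example is a list of attribute values followed by a 'Yes'/'No' label.
    """
    hypothesis = None  # stands for the all-"null" starting hypothesis
    for *attributes, label in examples:
        if label != "Yes":
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # first positive example
        else:
            # Generalize every attribute that disagrees with this example
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis


# Illustrative training data (hypothetical)
training_data = [
    ["Sunny", "Warm", "Normal", "Strong", "Yes"],
    ["Sunny", "Warm", "High", "Strong", "Yes"],
    ["Rainy", "Cold", "High", "Strong", "No"],
    ["Sunny", "Warm", "High", "Weak", "Yes"],
]
print(find_s(training_data))  # ['Sunny', 'Warm', '?', '?']
```

The third attribute is generalized by the second positive example and the fourth by the last one; the negative example leaves the hypothesis unchanged.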
Program:
Let's start by creating a sample CSV file with training data. The CSV file should contain rows with attribute
values and the class label (e.g., "Yes" or "No").
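import csv

# Step 1: Read the CSV file and return its rows as a list of lists
def read_csv(file_path):
    data = []
    with open(file_path, mode='r') as file:
        reader = csv.reader(file)
        for row in reader:
            data.append(row)
    return data

# Step 2: Apply Find-S to the rows
# (function header and initialization are reconstructed; the listing does not show them)
def find_s(data):
    hypothesis = None
    for row in data:
        if row[-1] == 'Yes':  # Only consider positive instances (class 'Yes')
            if hypothesis is None:
                # First positive example: copy its attribute values
                hypothesis = list(row[:-1])
                continue
            for i in range(len(hypothesis)):
                # Generalize the hypothesis if needed
                if hypothesis[i] != row[i]:
                    hypothesis[i] = '?'  # Use '?' to denote any value (generalized)
    return hypothesis

# Step 3: Run Find-S on the CSV file (the file name is an assumption)
def main():
    data = read_csv('training_data.csv')
    print("Most specific hypothesis:", find_s(data))

if __name__ == "__main__":
    main()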
Output:
Result:
Thus the given program was executed successfully and the output was verified.
Ex.No:03
Support Vector Machine (SVM) Decision Boundary
Date:
Aim:
To create a synthetic dataset, train an SVM model, and plot the decision boundary.
Algorithm:
1. Generate a synthetic dataset with two features and two classes using Scikit-Learn's make_blobs.
2. Train an SVM model on the dataset.
3. Visualize the data points and plot the SVM decision boundary.
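For a linear kernel, the decision boundary in step 3 is the line w . x + b = 0, and the margin width is 2 / ||w||. The sketch below recovers both from a fitted model. The dataset parameters are assumptions, and the formula for x2 assumes the boundary is not vertical (w[1] is nonzero).

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data (sample count, spread and seed are assumptions)
X, y = make_blobs(n_samples=100, centers=2, n_features=2,
                  cluster_std=1.0, random_state=42)

svm = SVC(kernel="linear")
svm.fit(X, y)

# Weight vector and bias of the separating line w . x + b = 0
w = svm.coef_[0]
b = svm.intercept_[0]

# Points on the boundary: x2 = -(w1 * x1 + b) / w2
x1 = np.linspace(X[:, 0].min(), X[:, 0].max(), 50)
x2 = -(w[0] * x1 + b) / w[1]

# The decision function should be (numerically) zero along that line
boundary_values = svm.decision_function(np.column_stack([x1, x2]))
print("Max |decision value| on the boundary:", np.abs(boundary_values).max())

# Distance between the two margin lines w . x + b = +1 and w . x + b = -1
print("Margin width:", 2 / np.linalg.norm(w))
```

Plotting x1 against x2 with plt.plot draws the same boundary that a contour of the decision function at level 0 would show.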
Program:
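import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Step 1: Generate a synthetic two-class dataset with make_blobs
# (sample count and seed are assumed; the listing does not show this step)
X, y = datasets.make_blobs(n_samples=100, centers=2, n_features=2, random_state=42)
# Step 2: Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Step 3: Standardize the features (assumed from the StandardScaler import)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Step 4: Train the SVM classifier (use a linear kernel for simplicity)
svm = SVC(kernel='linear', random_state=42)
svm.fit(X_train, y_train)
# Step 5: Evaluate the decision function on a grid and draw the boundary (assumed step)
xx, yy = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
Z = svm.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
plt.contour(xx, yy, Z, levels=[0], colors='k')
# Step 6: Plot the training points
plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, edgecolors='k', marker='o', s=100, cmap=plt.cm.coolwarm)
plt.show()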
Output:
Result:
Thus the given program was executed successfully and the output was verified.
Ex.No:04
Decision Tree Classification using ID3 Algorithm
Date:
Aim:
To demonstrate the working of the ID3 algorithm on a dataset and use the trained model to classify a
new sample.
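ID3 chooses, at each node, the attribute with the highest information gain (the reduction in entropy of the class label). A minimal sketch of that calculation is given below. The helper names and the four rows are hypothetical; only the column names Weather, Temperature and PlayTennis follow the program's comment.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Illustrative rows (hypothetical values)
rows = [
    {"Weather": "Sunny", "Temperature": "Hot", "PlayTennis": "No"},
    {"Weather": "Sunny", "Temperature": "Mild", "PlayTennis": "No"},
    {"Weather": "Rainy", "Temperature": "Hot", "PlayTennis": "Yes"},
    {"Weather": "Rainy", "Temperature": "Mild", "PlayTennis": "Yes"},
]

gains = {a: information_gain(rows, a, "PlayTennis") for a in ("Weather", "Temperature")}
best_attribute = max(gains, key=gains.get)
print(gains)
print("Attribute chosen for the root:", best_attribute)
```

In this toy data Weather separates the classes perfectly (gain 1 bit) while Temperature tells nothing (gain 0), so ID3 would split on Weather first. Note that scikit-learn's DecisionTreeClassifier is CART-based; setting criterion='entropy' uses the same split measure but is not a literal ID3 implementation.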
Algorithm:
Program:
# Convert to DataFrame
df = pd.DataFrame(data)
# Encode the categorical variables (Weather, Temperature, PlayTennis)
df_encoded = pd.get_dummies(df)
prediction = clf.predict(new_sample)
print(f"Predicted class for the new sample: {prediction[0]}")
OUTPUT 1
OUTPUT 2
Result:
Thus the given program was executed successfully and the output was verified.
Ex.No:05
Clustering Using EM (GMM) and k-Means Algorithms
Date:
Aim:
To cluster a dataset using EM and k-Means algorithms, compare their results, and evaluate the quality of
clustering.
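One way to carry out the comparison described in the aim is sketched below. The synthetic dataset, the number of clusters, and the choice of the silhouette score and adjusted Rand index as quality measures are all assumptions, since the original listing does not show them.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic data with three well-separated groups (all parameters are assumptions)
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

# k-Means: hard assignment of each point to the nearest centroid
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans_labels = kmeans.fit_predict(X)

# GMM: fitted with EM, each point assigned to its most probable component
gmm = GaussianMixture(n_components=3, random_state=42)
gmm_labels = gmm.fit_predict(X)

# Quality of each clustering (closer to 1 is better)
kmeans_score = silhouette_score(X, kmeans_labels)
gmm_score = silhouette_score(X, gmm_labels)
print(f"Silhouette (k-Means): {kmeans_score:.3f}")
print(f"Silhouette (GMM):     {gmm_score:.3f}")

# Agreement between the two labelings, independent of label numbering
agreement = adjusted_rand_score(kmeans_labels, gmm_labels)
print(f"Adjusted Rand index between the two: {agreement:.3f}")

print("K-Means cluster centers:\n", kmeans.cluster_centers_)
print("GMM component means:\n", gmm.means_)
```

On well-separated data the two methods tend to agree closely; differences are more likely to appear when clusters overlap or are elongated, where GMM's per-component covariance can matter.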
Algorithm:
Program:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
# Show the plots
plt.tight_layout()
plt.show()
# Optional: print cluster centers for K-Means and GMM component means
print("K-Means Cluster Centers:")
print(kmeans.cluster_centers_)
Output:
Result:
Thus the given program was executed successfully and the output was verified.
Ex.No:06
k-Nearest Neighbor Classification
Date:
Aim:
To implement the k-NN algorithm on the Iris dataset and print both correct and incorrect predictions.
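The idea behind k-NN can be sketched without a library classifier: measure the distance from the query point to every training point, take the k closest, and let them vote. The function name knn_predict and the small dataset are hypothetical and are used only for illustration (this is not the Iris data).

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_query, k=3):
    """Predict the label of x_query by majority vote among its k nearest training points."""
    distances = np.linalg.norm(X_train - x_query, axis=1)  # Euclidean distances
    nearest = np.argsort(distances)[:k]                    # indices of the k closest points
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Tiny illustrative dataset with two groups (hypothetical values)
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([1.1, 1.0])))  # 0
print(knn_predict(X_train, y_train, np.array([5.1, 5.0])))  # 1
```

An odd k avoids tied votes in two-class problems; with more classes, ties are still possible and are broken here by the order in which labels are counted.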
Algorithm:
Program:
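import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
# Step 1: Load the Iris dataset (reconstructed; the listing does not show this step)
iris = load_iris()
X, y = iris.data, iris.target
# Step 2: Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Step 3: Initialize and train the k-NN classifier (k=3 in this case)
k = 3
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(X_train, y_train)
# Step 4: Predict the labels of the test set (reconstructed)
y_pred = knn.predict(X_test)
# Step 5: Separate correct and incorrect predictions
correct_predictions = []
incorrect_predictions = []
for i in range(len(y_test)):
    if y_pred[i] == y_test[i]:
        correct_predictions.append((X_test[i], y_test[i], y_pred[i]))
    else:
        incorrect_predictions.append((X_test[i], y_test[i], y_pred[i]))
# Step 6: Print correct predictions (reconstructed to mirror Step 7)
print("Correct Predictions:")
for x, actual, predicted in correct_predictions:
    print(f"Features: {x}, Actual: {iris.target_names[actual]}, Predicted: {iris.target_names[predicted]}")
# Step 7: Print incorrect predictions
print("\nIncorrect Predictions:")
for x, actual, predicted in incorrect_predictions:
    print(f"Features: {x}, Actual: {iris.target_names[actual]}, Predicted: {iris.target_names[predicted]}")
# Step 8: Print the accuracy (reconstructed from the Output section)
print(f"Accuracy:{accuracy_score(y_test, y_pred) * 100:.2f}%")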
Output:
Correct Predictions:
Incorrect Predictions:
Accuracy:100.00%
Result:
Thus the given program was executed successfully and the output was verified.