
EXPERIMENT - 1

AIM:
Exploring and demonstrating Python.

THEORY:
Python is a high-level, interpreted programming language known for its simplicity
and readability. It is widely used in various fields such as web development, data analysis,
machine learning, automation, and more. Python's syntax is designed to be easy to read and
write, making it an excellent choice for beginners and experienced programmers alike.

1. Classes and Objects: Python is an object-oriented programming (OOP) language, which means it supports the concepts of classes and objects. A class is a blueprint for creating objects, which are instances of the class. Classes encapsulate data and functions that operate on that data. Objects are instances that hold the actual data and can use the class's methods.

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def greet(self):
        return f'Hello, my name is {self.name} and I am {self.age} years old.'

# Create an instance of the class
person = Person('Adi', 19)
print(person.greet())

2. Functions: Functions are blocks of reusable code that perform a specific task. They allow
for modular and organized code, making it easier to manage and debug. Python functions
are defined using the def keyword followed by the function name and parameters.
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

# Using the functions
print(add(5, 3))       # Output: 8
print(subtract(5, 3))  # Output: 2

3. SciPy: SciPy stands for "Scientific Python" and is an open-source Python library used for scientific and technical computing. It builds on NumPy and provides a large collection of mathematical algorithms and convenience functions, making it easier to perform scientific and engineering tasks. Here are a few key components of SciPy:
1. Linear Algebra: Provides functions for matrix operations, solving linear systems,
eigenvalue problems, and more.
2. Optimization: Contains functions for finding the minimum or maximum of functions
(optimization), including linear programming and curve fitting.
3. Integration: Offers methods for calculating integrals, including numerical integration and
ordinary differential equations (ODE) solvers.
4. Statistics: Includes functions for statistical distributions, hypothesis testing, and
descriptive statistics.
5. Signal Processing: Provides tools for filtering, signal analysis, and Fourier transforms.

Linear Algebra
import numpy as np
from scipy import linalg

# Creating a matrix
A = np.array([[1, 2], [3, 4]])

# Computing the determinant
det = linalg.det(A)
print("Determinant:", det)

# Solving a linear system of equations
b = np.array([5, 6])
x = linalg.solve(A, b)
print("Solution:", x)

This code demonstrates how to compute the determinant of a matrix and solve a linear system of equations using SciPy's linear algebra module.

OUTPUT

Statistics
import numpy as np
from scipy import stats

# Creating a dataset
data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5])

# Computing descriptive statistics
mean = np.mean(data)
std_dev = np.std(data)
median = np.median(data)
print("Mean:", mean)
print("Standard Deviation:", std_dev)
print("Median:", median)

# Performing a one-sample t-test against a hypothesized mean of 3
t_stat, p_value = stats.ttest_1samp(data, 3)
print("T-statistic:", t_stat)
print("P-value:", p_value)

This code demonstrates how to compute descriptive statistics and perform a t-test using SciPy's statistics module.

OUTPUT

Scikit-learn is an open-source Python library for machine learning. It is built on NumPy, SciPy, and Matplotlib and provides simple and efficient tools for data analysis and modeling. Here are a few key components of scikit-learn:


1. Supervised Learning: Involves training a model on a labeled dataset, meaning the input
data is paired with the correct output. Examples include classification and regression.
2. Unsupervised Learning: Involves training a model on data without labeled responses.
Examples include clustering and dimensionality reduction.
3. Model Selection and Evaluation: Tools for evaluating and comparing different models,
including cross-validation and various metrics.
4. Preprocessing: Functions for feature extraction, normalization, and data transformation to
prepare data for modeling.

Supervised Learning: Classification
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report, accuracy_score

iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

clf = SVC(kernel='linear')
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))

This code demonstrates how to load the Iris dataset, preprocess the data, train a Support Vector Machine (SVM) classifier, make predictions, and evaluate the model.

OUTPUT

Unsupervised Learning: Clustering
from sklearn import datasets
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

iris = datasets.load_iris()
X = iris.data

kmeans = KMeans(n_clusters=3, random_state=42)
kmeans.fit(X)

# Plot the clusters
plt.scatter(X[:, 0], X[:, 1], c=kmeans.labels_, cmap='viridis')
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.title('KMeans Clustering')
plt.show()

This code demonstrates how to load the Iris dataset, train a KMeans clustering model, and
visualize the clusters.

OUTPUT

Model Selection and Evaluation: Cross-Validation
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier

# Load the dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Train a Random Forest classifier with cross-validation
clf = RandomForestClassifier(n_estimators=100, random_state=42)
scores = cross_val_score(clf, X, y, cv=5)

# Print the cross-validation scores
print("Cross-Validation Scores:", scores)
print("Mean Cross-Validation Score:", scores.mean())

This code demonstrates how to use cross-validation to evaluate the performance of a Random
Forest classifier on the Iris dataset.

OUTPUT

Preprocessing: Feature Scaling
from sklearn.preprocessing import MinMaxScaler
import numpy as np

# Create a sample dataset
data = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])

# Apply Min-Max scaling
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(data)

# Print the scaled data
print("Scaled Data:\n", scaled_data)


This code demonstrates how to apply Min-Max scaling to a sample dataset to normalize the
features.

OUTPUT:

EXPERIMENT - 2

AIM:
Perform data preprocessing tasks such as outlier detection, handling missing values, analyzing redundancy, and normalization on different datasets.

THEORY:
Data preprocessing is a crucial step in the machine learning pipeline. It ensures that the data
fed into models is clean, consistent, and formatted appropriately. Poor data quality can
significantly degrade the performance of machine learning algorithms.

1. Handling Missing Values
Missing values occur when no data value is stored for a variable in an observation.
2. Outlier Detection
Outliers are data points that differ significantly from others in the dataset. They can skew the performance of models and affect accuracy.
3. Analyzing Redundancy
Redundancy occurs when two or more features provide the same information.
4. Normalization
Normalization scales the data to a standard range, especially useful when features have different units or ranges.
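
A common rule of thumb for outlier detection, and the one used in the code below, is the interquartile range (IQR) method: with Q1 and Q3 the 25th and 75th percentiles, IQR = Q3 - Q1, and any value below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR is flagged as an outlier. Similarly, min-max normalization rescales each value as x' = (x - min) / (max - min), mapping the feature into the range [0, 1]; this is the formula applied in the normalization step below.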

CODE
# missing values
import pandas as pd

df = pd.read_csv('students.csv')

print("Original Data:\n", df)


df.fillna(df.mean(numeric_only=True), inplace=True)
print("\nAfter Filling Missing Values (with mean):\n", df)

# outlier detection
import pandas as pd

df = pd.read_csv('salaries.csv')
salaries = df['Salary']

Q1 = salaries.quantile(0.25)
Q3 = salaries.quantile(0.75)
IQR = Q3 - Q1
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR

outliers = df[(salaries < lower_bound) | (salaries > upper_bound)]

print("Outliers Detected:\n", outliers)

# redundancy analysis
import pandas as pd

df = pd.read_csv('products.csv')

redundant_columns = []
for col1 in df.columns:
for col2 in df.columns:
if col1 != col2 and df[col1].equals(df[col2]):
redundant_columns.append((col1, col2))

print("Redundant Columns:", redundant_columns)

# normalization
import pandas as pd

df = pd.read_csv('athletes.csv')

for col in ['Height', 'Weight']:
    min_val = df[col].min()
    max_val = df[col].max()
    df[col + '_Normalized'] = (df[col] - min_val) / (max_val - min_val)

print("After Normalization:\n", df)

OUTPUT

EXPERIMENT - 3

AIM:
Write a program to implement Linear Regression using any appropriate dataset.

THEORY:
What is Linear Regression?
Linear Regression is a supervised machine learning algorithm used to model the relationship
between a dependent variable (target) and one or more independent variables (features). It
assumes a linear relationship between the variables — that is, the change in the target
variable is proportional to the change in the feature variable(s).

The goal of Linear Regression is to find the best-fitting straight line (regression line) that
minimizes the difference between the actual data points and the predicted values from the
model.

It is commonly used for predictive analysis, such as estimating sales, prices, or salaries based
on certain inputs.
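
For a single feature, the fitted line has the form y = b0 + b1*x, where the least-squares slope is b1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)² and the intercept is b0 = ȳ - b1*x̄. The short sketch below computes these quantities directly with NumPy on a small made-up sample (the values are illustrative, not taken from the salary_dataset.csv file used in the main code) and should agree with what scikit-learn's LinearRegression fits on the same points.

import numpy as np

# Illustrative sample (hypothetical values, for demonstration only)
x = np.array([1.1, 2.0, 3.2, 4.5, 5.1])            # years of experience
y = np.array([39000, 46000, 60000, 65000, 70000])  # salary

# Closed-form least-squares estimates
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()
print("Slope:", slope)
print("Intercept:", intercept)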

CODE
import pandas as pd
from sklearn.linear_model import LinearRegression

# Load the dataset
data = pd.read_csv("salary_dataset.csv")

# Extract features and target
X = data[["YearsExperience"]]
y = data["Salary"]

# Create and train the model
model = LinearRegression()
model.fit(X, y)

# Display the coefficient and intercept
print(f"Model coefficient (slope): {model.coef_[0]}")
print(f"Model intercept: {model.intercept_}")

# Predict salary for a specific experience (e.g., 1.3 years)
experience = 1.3
predicted_salary = model.predict([[experience]])
print(f"Predicted salary for {experience} years of experience: {predicted_salary[0]:.2f}")

OUTPUT

EXPERIMENT - 4

AIM:
Write a program to exhibit the working of the decision tree based ID3 algorithm. With the
help of an appropriate data set, build the decision tree and classify a new sample.

THEORY:
ID3 is a decision tree algorithm developed by Ross Quinlan. It builds a decision tree from a
dataset by using a top-down, greedy approach to select the attribute that maximizes
Information Gain.

1. Entropy - Entropy measures the impurity or uncertainty in the dataset.
2. Information Gain (IG) - It tells us how much "information" a feature gives us about the class.

At each step, ID3 selects the feature that maximizes Information Gain. This means it chooses
the attribute that best separates the data into classes.
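
For reference, the standard definitions (these are what the entropy and info_gain functions in the code below compute) are:

Entropy(S) = - Σ p_i * log2(p_i), where p_i is the proportion of class i in S
Gain(S, A) = Entropy(S) - Σ_v (|S_v| / |S|) * Entropy(S_v), where S_v is the subset of S for which attribute A has value v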

CODE
import pandas as pd
import numpy as np
import math
from collections import Counter

data = [
['Sunny', 'Hot', 'High', 'Weak', 'No'],
['Sunny', 'Hot', 'High', 'Strong', 'No'],
['Overcast', 'Hot', 'High', 'Weak', 'Yes'],
['Rain', 'Mild', 'High', 'Weak', 'Yes'],
['Rain', 'Cool', 'Normal', 'Weak', 'Yes'],
['Rain', 'Cool', 'Normal', 'Strong', 'No'],
['Overcast', 'Cool', 'Normal', 'Strong', 'Yes'],
['Sunny', 'Mild', 'High', 'Weak', 'No'],
['Sunny', 'Cool', 'Normal', 'Weak', 'Yes'],
['Rain', 'Mild', 'Normal', 'Weak', 'Yes'],
['Sunny', 'Mild', 'Normal', 'Strong', 'Yes'],
['Overcast', 'Mild', 'High', 'Strong', 'Yes'],
['Overcast', 'Hot', 'Normal', 'Weak', 'Yes'],
['Rain', 'Mild', 'High', 'Strong', 'No']
]

columns = ['Outlook', 'Temperature', 'Humidity', 'Wind', 'PlayTennis']

# Build the DataFrame from the data above (or load it from a CSV: df = pd.read_csv('play_tennis.csv'))
df = pd.DataFrame(data, columns=columns)

def entropy(target_col):
    values, counts = np.unique(target_col, return_counts=True)
    return -np.sum([(counts[i] / np.sum(counts)) * math.log2(counts[i] / np.sum(counts))
                    for i in range(len(values))])

def info_gain(data, split_attribute_name, target_name="PlayTennis"):
    total_entropy = entropy(data[target_name])
    vals, counts = np.unique(data[split_attribute_name], return_counts=True)

    weighted_entropy = np.sum([
        (counts[i] / np.sum(counts)) *
        entropy(data.where(data[split_attribute_name] == vals[i]).dropna()[target_name])
        for i in range(len(vals))
    ])

    return total_entropy - weighted_entropy

def ID3(data, original_data, features, target_attribute_name="PlayTennis", parent_node_class=None):

    if len(np.unique(data[target_attribute_name])) <= 1:
        return np.unique(data[target_attribute_name])[0]

    elif len(data) == 0:
        return np.unique(original_data[target_attribute_name])[
            np.argmax(np.unique(original_data[target_attribute_name], return_counts=True)[1])
        ]

    elif len(features) == 0:
        return parent_node_class

    else:
        parent_node_class = np.unique(data[target_attribute_name])[
            np.argmax(np.unique(data[target_attribute_name], return_counts=True)[1])
        ]

        item_values = [info_gain(data, feature, target_attribute_name) for feature in features]
        best_feature_index = np.argmax(item_values)
        best_feature = features[best_feature_index]

        tree = {best_feature: {}}
        features = [i for i in features if i != best_feature]

        for value in np.unique(data[best_feature]):
            sub_data = data.where(data[best_feature] == value).dropna()
            subtree = ID3(sub_data, original_data, features, target_attribute_name, parent_node_class)
            tree[best_feature][value] = subtree

        return tree

features = list(df.columns)
features.remove('PlayTennis')
tree = ID3(df, df, features)
print("Decision Tree:", tree)

def classify(sample, tree):
    for attr in tree:
        if sample[attr] in tree[attr]:
            subtree = tree[attr][sample[attr]]
            if isinstance(subtree, dict):
                return classify(sample, subtree)
            else:
                return subtree
        else:
            return "Unknown"

new_sample = {'Outlook': 'Sunny', 'Temperature': 'Cool', 'Humidity': 'High', 'Wind': 'Strong'}


prediction = classify(new_sample, tree)
print("Prediction for new sample:", prediction)
OUTPUT

EXPERIMENT - 5

AIM:
Write a program to demonstrate the working of the decision tree based C4.5 algorithm.
With the help of the data set used in the above experiment, build the decision tree and classify a new sample.

THEORY:
C4.5 is a decision tree algorithm developed by Ross Quinlan as an extension of ID3. It
addresses many of ID3’s limitations, especially around continuous data, pruning, and
overfitting.
It is widely used for classification problems and forms the basis for more advanced
algorithms like C5.0 and Random Forest.

C4.5 improves over ID3’s Information Gain by using Gain Ratio, which penalizes attributes
with many values.
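
In the standard formulation, the gain ratio of an attribute A on a dataset S is

SplitInfo(S, A) = - Σ_v (|S_v| / |S|) * log2(|S_v| / |S|)
GainRatio(S, A) = Gain(S, A) / SplitInfo(S, A)

which is exactly what the split_info and gain_ratio functions in the code below compute.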

Advantages of C4.5
●​ Can handle both categorical and numerical data.
●​ Deals with missing values.
●​ Uses pruning to improve generalization.
●​ Uses Gain Ratio to prevent bias toward many-valued attributes.
●​ Widely used and robust for practical classification problems.

CODE
import pandas as pd
import numpy as np
import math

df = pd.read_csv('play_tennis.csv')

def entropy(target_col):
    values, counts = np.unique(target_col, return_counts=True)
    return -np.sum([(counts[i] / np.sum(counts)) * math.log2(counts[i] / np.sum(counts))
                    for i in range(len(values))])

def split_info(data, split_attribute_name):
    vals, counts = np.unique(data[split_attribute_name], return_counts=True)
    return -np.sum([(counts[i] / np.sum(counts)) * math.log2(counts[i] / np.sum(counts))
                    for i in range(len(vals))])

def gain_ratio(data, split_attribute_name, target_name="PlayTennis"):
    ig = info_gain(data, split_attribute_name, target_name)
    si = split_info(data, split_attribute_name)
    return ig / si if si != 0 else 0

def info_gain(data, split_attribute_name, target_name="PlayTennis"):
    total_entropy = entropy(data[target_name])
    vals, counts = np.unique(data[split_attribute_name], return_counts=True)

    weighted_entropy = np.sum([
        (counts[i] / np.sum(counts)) *
        entropy(data.where(data[split_attribute_name] == vals[i]).dropna()[target_name])
        for i in range(len(vals))
    ])

    return total_entropy - weighted_entropy

def C45(data, original_data, features, target_attribute_name="PlayTennis", parent_node_class=None):
    if len(np.unique(data[target_attribute_name])) <= 1:
        return np.unique(data[target_attribute_name])[0]

    elif len(data) == 0:
        return np.unique(original_data[target_attribute_name])[
            np.argmax(np.unique(original_data[target_attribute_name], return_counts=True)[1])
        ]

    elif len(features) == 0:
        return parent_node_class

    else:
        parent_node_class = np.unique(data[target_attribute_name])[
            np.argmax(np.unique(data[target_attribute_name], return_counts=True)[1])
        ]

        item_values = [gain_ratio(data, feature, target_attribute_name) for feature in features]
        best_feature_index = np.argmax(item_values)
        best_feature = features[best_feature_index]

        tree = {best_feature: {}}
        features = [i for i in features if i != best_feature]

        for value in np.unique(data[best_feature]):
            sub_data = data.where(data[best_feature] == value).dropna()
            subtree = C45(sub_data, original_data, features, target_attribute_name, parent_node_class)
            tree[best_feature][value] = subtree

        return tree

features = list(df.columns)
features.remove('PlayTennis')
tree = C45(df, df, features)
print("C4.5 Decision Tree:", tree)

def classify(sample, tree):
    for attr in tree:
        if sample[attr] in tree[attr]:
            subtree = tree[attr][sample[attr]]
            if isinstance(subtree, dict):
                return classify(sample, subtree)
            else:
                return subtree
        else:
            return "Unknown"

# Example prediction
new_sample = {'Outlook': 'Sunny', 'Temperature': 'Cool', 'Humidity': 'High', 'Wind': 'Strong'}
prediction = classify(new_sample, tree)
print("Prediction for new sample:", prediction)

OUTPUT

EXPERIMENT - 6

AIM:
Write a program to demonstrate the working of the decision tree based CART algorithm.
Build the decision tree and classify a new sample using a suitable dataset. Compare the
performance of ID3, C4.5, and CART in terms of accuracy, recall, precision and sensitivity.

THEORY:
ID3 Algorithm
ID3 (Iterative Dichotomiser 3) is one of the earliest decision tree algorithms. It uses
Information Gain as the splitting criterion, which tends to favor attributes with many distinct
values. ID3 works only with categorical data and does not handle missing values. It also lacks
pruning, which makes it prone to overfitting, especially on noisy datasets.
●​ Splitting criterion: Information Gain
●​ Data types supported: Categorical only
●​ Missing value handling: Not supported
●​ Pruning: Not performed
●​ Output tree: Multi-way
●​ Performance (general):
●​ Accuracy: Around 85–90%
●​ Precision: Approximately 0.82
●​ Recall/Sensitivity: Approximately 0.84
●​ F1-Score: Around 0.83

C4.5 Algorithm
C4.5 is an improvement over ID3, also developed by Ross Quinlan. It uses Gain Ratio as the
splitting criterion, which corrects the bias seen in Information Gain. C4.5 supports both
categorical and continuous features and can handle missing values effectively. It performs
post-pruning, which helps prevent overfitting and improves generalization.
●​ Splitting criterion: Gain Ratio
●​ Data types supported: Categorical and continuous
●​ Missing value handling: Supported
●​ Pruning: Post-pruning is applied
●​ Output tree: Multi-way
●​ Performance (general):
●​ Accuracy: Around 88–93%
●​ Precision: Approximately 0.86
●​ Recall/Sensitivity: Approximately 0.89
●​ F1-Score: Around 0.87

CART Algorithm
CART (Classification and Regression Trees) is a binary decision tree algorithm that uses the
Gini Index to determine the best splits. It supports both categorical and continuous features and handles missing values well. CART constructs strictly binary trees and is capable of both
classification and regression, making it more versatile. It also includes a cost-complexity
pruning mechanism to avoid overfitting.
●​ Splitting criterion: Gini Index
●​ Data types supported: Categorical and continuous
●​ Missing value handling: Supported
●​ Pruning: Cost-complexity pruning is applied
●​ Output tree: Binary only
●​ Performance (general):
●​ Accuracy: Around 87–92%
●​ Precision: Approximately 0.85
●​ Recall/Sensitivity: Approximately 0.87
●​ F1-Score: Around 0.86
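
For reference, the Gini index of a set S with class proportions p_i is Gini(S) = 1 - Σ p_i², and a candidate split is scored by the weighted average of the Gini index of the groups it produces; this is what the gini_index function in the code below computes.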

CODE
import pandas as pd
import numpy as np
from collections import Counter
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

df = pd.read_csv('play_tennis.csv')
data = df.values.tolist()
headers = df.columns.tolist()

def gini_index(groups, classes):
    n_instances = float(sum([len(group) for group in groups]))
    gini = 0.0
    for group in groups:
        size = float(len(group))
        if size == 0:
            continue
        score = 0.0
        for class_val in classes:
            proportion = [row[-1] for row in group].count(class_val) / size
            score += proportion ** 2
        gini += (1 - score) * (size / n_instances)
    return gini

def test_split(index, value, dataset):
    left, right = [], []
    for row in dataset:
        if row[index] == value:
            left.append(row)
        else:
            right.append(row)
    return left, right

def get_split(dataset):
    class_values = list(set(row[-1] for row in dataset))
    best_index, best_value, best_score, best_groups = 999, None, 999, None
    for index in range(len(dataset[0]) - 1):
        for row in dataset:
            groups = test_split(index, row[index], dataset)
            gini = gini_index(groups, class_values)
            if gini < best_score:
                best_index, best_value, best_score, best_groups = index, row[index], gini, groups
    return {'index': best_index, 'value': best_value, 'groups': best_groups}

def to_terminal(group):
    outcomes = [row[-1] for row in group]
    return max(set(outcomes), key=outcomes.count)

def split(node, max_depth, min_size, depth):
    left, right = node['groups']
    del(node['groups'])

    if not left or not right:
        node['left'] = node['right'] = to_terminal(left + right)
        return

    if depth >= max_depth:
        node['left'], node['right'] = to_terminal(left), to_terminal(right)
        return

    if len(left) <= min_size:
        node['left'] = to_terminal(left)
    else:
        node['left'] = get_split(left)
        split(node['left'], max_depth, min_size, depth + 1)

    if len(right) <= min_size:
        node['right'] = to_terminal(right)
    else:
        node['right'] = get_split(right)
        split(node['right'], max_depth, min_size, depth + 1)

def build_tree(train, max_depth, min_size):
    root = get_split(train)
    split(root, max_depth, min_size, 1)
    return root

def predict(node, row):
    if row[node['index']] == node['value']:
        if isinstance(node['left'], dict):
            return predict(node['left'], row)
        else:
            return node['left']
    else:
        if isinstance(node['right'], dict):
            return predict(node['right'], row)
        else:
            return node['right']

def entropy(data):
    labels = [row[-1] for row in data]
    counter = Counter(labels)
    total = len(data)
    return -sum((count / total) * np.log2(count / total) for count in counter.values())

def info_gain(data, attr_index):
    total_entropy = entropy(data)
    values = set(row[attr_index] for row in data)
    subsets = [[row for row in data if row[attr_index] == val] for val in values]
    weighted_entropy = sum((len(subset) / len(data)) * entropy(subset) for subset in subsets)
    return total_entropy - weighted_entropy

def gain_ratio(data, attr_index):
    gain = info_gain(data, attr_index)
    values = [row[attr_index] for row in data]
    split_info = entropy([[v] for v in values])
    return gain / split_info if split_info != 0 else 0

def majority_class(data):
    return Counter([row[-1] for row in data]).most_common(1)[0][0]

def id3(data, features):
    labels = [row[-1] for row in data]
    if labels.count(labels[0]) == len(labels):
        return labels[0]
    if not features:
        return majority_class(data)

    gains = [info_gain(data, i) for i in features]
    best_attr = features[gains.index(max(gains))]
    tree = {headers[best_attr]: {}}
    values = set(row[best_attr] for row in data)

    for value in values:
        subset = [row for row in data if row[best_attr] == value]
        if not subset:
            tree[headers[best_attr]][value] = majority_class(data)
        else:
            subtree = id3(subset, [i for i in features if i != best_attr])
            tree[headers[best_attr]][value] = subtree
    return tree

def c45(data, features):
    labels = [row[-1] for row in data]
    if labels.count(labels[0]) == len(labels):
        return labels[0]
    if not features:
        return majority_class(data)

    ratios = [gain_ratio(data, i) for i in features]
    best_attr = features[ratios.index(max(ratios))]
    tree = {headers[best_attr]: {}}
    values = set(row[best_attr] for row in data)

    for value in values:
        subset = [row for row in data if row[best_attr] == value]
        if not subset:
            tree[headers[best_attr]][value] = majority_class(data)
        else:
            subtree = c45(subset, [i for i in features if i != best_attr])
            tree[headers[best_attr]][value] = subtree
    return tree

def predict_tree(tree, row):
    if not isinstance(tree, dict):
        return tree
    attr = next(iter(tree))
    index = headers.index(attr)
    value = row[index]
    if value in tree[attr]:
        return predict_tree(tree[attr][value], row)
    else:
        return None

# Evaluate models
def evaluate_model(model_type):
    true_labels = [row[-1] for row in data]
    predictions = []

    if model_type == 'ID3':
        tree = id3(data, list(range(len(data[0]) - 1)))
        for row in data:
            pred = predict_tree(tree, row)
            predictions.append(pred if pred else majority_class(data))
    elif model_type == 'C4.5':
        tree = c45(data, list(range(len(data[0]) - 1)))
        for row in data:
            pred = predict_tree(tree, row)
            predictions.append(pred if pred else majority_class(data))
    elif model_type == 'CART':
        tree = build_tree(data, max_depth=5, min_size=1)
        for row in data:
            predictions.append(predict(tree, row))

    accuracy = accuracy_score(true_labels, predictions)
    precision = precision_score(true_labels, predictions, pos_label='Yes', zero_division=0)
    recall = recall_score(true_labels, predictions, pos_label='Yes', zero_division=0)
    f1 = f1_score(true_labels, predictions, pos_label='Yes', zero_division=0)

    print(f"{model_type} Results:")
    print("Accuracy:", round(accuracy, 3))
    print("Precision:", round(precision, 3))
    print("Recall / Sensitivity:", round(recall, 3))
    print("F1-Score:", round(f1, 3))
    print()

evaluate_model('ID3')
evaluate_model('C4.5')
evaluate_model('CART')

OUTPUT

EXPERIMENT - 7

AIM:
Build an Artificial Neural Network by implementing the Backpropagation algorithm and test
the same using appropriate data sets.

THEORY:
An Artificial Neural Network is a computational model inspired by the structure and
functioning of the biological brain. It is a key technique in the field of machine learning and
deep learning, used for recognizing complex patterns and solving problems like
classification, regression, and prediction.

Backpropagation:
Forward Pass: Input is passed through the network to generate output.
Loss Calculation: Difference between predicted and actual output is calculated using a loss
function (e.g., Mean Squared Error).
Backward Pass: Gradients are calculated using the chain rule to update weights and biases
(Backpropagation).
Weight Update: Weights are updated using gradient descent.
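
In the sketch below, the activation is the sigmoid σ(x) = 1 / (1 + e^(-x)), whose derivative can be written in terms of its output as σ(x) * (1 - σ(x)); this is why sigmoid_derivative takes the already-activated value as its argument. Each weight is moved against the gradient of the error, w ← w - η * ∂E/∂w with learning rate η (0.1 in the code); because the code defines the error as (y - predicted), the update appears there with a plus sign.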

CODE
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

# XOR Dataset
X = np.array([[0, 0],
[0, 1],
[1, 0],
[1, 1]])

y = np.array([[0],
[1],
[1],
[0]])

input_layer_neurons = 2
hidden_layer_neurons = 2
output_neurons = 1

np.random.seed(1)
wh = np.random.uniform(size=(input_layer_neurons, hidden_layer_neurons))
bh = np.random.uniform(size=(1, hidden_layer_neurons))
wo = np.random.uniform(size=(hidden_layer_neurons, output_neurons))
bo = np.random.uniform(size=(1, output_neurons))

epochs = 10000
learning_rate = 0.1

for i in range(epochs):
    # Forward Propagation
    hidden_input = np.dot(X, wh) + bh
    hidden_output = sigmoid(hidden_input)

    final_input = np.dot(hidden_output, wo) + bo
    predicted_output = sigmoid(final_input)

    # Backward Propagation
    error = y - predicted_output
    d_predicted_output = error * sigmoid_derivative(predicted_output)

    error_hidden_layer = d_predicted_output.dot(wo.T)
    d_hidden_layer = error_hidden_layer * sigmoid_derivative(hidden_output)

    # Update weights and biases
    wo += hidden_output.T.dot(d_predicted_output) * learning_rate
    bo += np.sum(d_predicted_output, axis=0, keepdims=True) * learning_rate
    wh += X.T.dot(d_hidden_layer) * learning_rate
    bh += np.sum(d_hidden_layer, axis=0, keepdims=True) * learning_rate

print("Final Output after Training:")
print(np.round(predicted_output, 3))

OUTPUT

EXPERIMENT - 8

AIM:
Write a program to implement the Naïve Bayesian classifier for appropriate dataset and
compute the performance measures of the model.

THEORY:
Naïve Bayes is a probabilistic machine learning algorithm based on Bayes’ Theorem,
particularly useful for classification tasks. It assumes that all features are independent of each
other, which is often not true in practice, but still gives good results—hence the name
"naïve."
Bayes’ Theorem:
P(H | X) = [P(X | H) * P(H)] / P(X)

Where:

●​ P(H∣X) = Posterior probability (Probability of hypothesis H given data X)


●​ P(X∣H) = Likelihood (Probability of data X given that hypothesis H is true)
●​ P(H) = Prior probability (Initial probability of hypothesis H)
●​ P(X) = Marginal probability (Total probability of data X)
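
Under the naïve independence assumption, the likelihood factorizes over the features, so classification reduces to choosing the label H that maximizes P(H) * Π P(x_i | H). The code below works with the logarithm of this product (a sum of log-probabilities) to avoid numerical underflow, and substitutes a small constant (1e-6) for feature values never seen with a given label.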

CODE
from collections import defaultdict
import math

dataset = [
['Sunny', 'Hot', 'High', 'Weak', 'No'],
['Sunny', 'Hot', 'High', 'Strong', 'No'],
['Overcast', 'Hot', 'High', 'Weak', 'Yes'],
['Rain', 'Mild', 'High', 'Weak', 'Yes'],
['Rain', 'Cool', 'Normal', 'Weak', 'Yes'],
['Rain', 'Cool', 'Normal', 'Strong', 'No'],
['Overcast', 'Cool', 'Normal', 'Strong', 'Yes'],
['Sunny', 'Mild', 'High', 'Weak', 'No'],
['Sunny', 'Cool', 'Normal', 'Weak', 'Yes'],
['Rain', 'Mild', 'Normal', 'Weak', 'Yes'],
['Sunny', 'Mild', 'Normal', 'Strong', 'Yes'],
['Overcast', 'Mild', 'High', 'Strong', 'Yes'],
['Overcast', 'Hot', 'Normal', 'Weak', 'Yes'],
['Rain', 'Mild', 'High', 'Strong', 'No']
]

X = [row[:-1] for row in dataset]
y = [row[-1] for row in dataset]

classes = set(y)

def train_naive_bayes(X, y):
    total_samples = len(y)
    label_probs = defaultdict(float)
    feature_probs = defaultdict(lambda: defaultdict(lambda: defaultdict(float)))

    for i in range(total_samples):
        label = y[i]
        label_probs[label] += 1
        for j in range(len(X[i])):
            feature_value = X[i][j]
            feature_probs[j][feature_value][label] += 1

    for label in label_probs:
        label_probs[label] /= total_samples

    for feature_idx in feature_probs:
        for value in feature_probs[feature_idx]:
            for label in feature_probs[feature_idx][value]:
                feature_probs[feature_idx][value][label] /= label_probs[label] * total_samples

    return label_probs, feature_probs

# Prediction
def predict_naive_bayes(sample, label_probs, feature_probs):
    scores = {}
    for label in label_probs:
        log_prob = math.log(label_probs[label])
        for i in range(len(sample)):
            value = sample[i]
            if value in feature_probs[i] and label in feature_probs[i][value]:
                log_prob += math.log(feature_probs[i][value][label])
            else:
                log_prob += math.log(1e-6)
        scores[label] = log_prob
    return max(scores, key=scores.get)

# Train model
label_probs, feature_probs = train_naive_bayes(X, y)

# Test the model on a sample
test_sample = ['Sunny', 'Cool', 'High', 'Strong']
prediction = predict_naive_bayes(test_sample, label_probs, feature_probs)
print("Prediction for sample", test_sample, "=>", prediction)

OUTPUT

EXPERIMENT - 9

AIM:
Write a program to implement k-Nearest Neighbor algorithm to classify any dataset of your
choice. Print both correct and wrong predictions.

THEORY:
k-Nearest Neighbor is a supervised machine learning algorithm used for classification and
regression tasks. It is instance-based or lazy learning, meaning it doesn't learn a model during
training, but rather stores the training data and makes decisions during prediction. It classifies
a data point based on how its neighbors are classified.

How it Works (Steps):


●​ Choose a value of k (number of neighbors).
●​ Calculate the distance between the new input and all training points.
●​ Select the k closest points (neighbors).
●​ Perform majority voting among neighbors (for classification).
●​ Return the class with the most votes.
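
The distance measure used in the code below is the Euclidean distance d(a, b) = sqrt(Σ (a_i - b_i)²) between the feature vectors a and b.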

CODE
import numpy as np
from collections import Counter
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

def euclidean_distance(a, b):
    return np.sqrt(np.sum((a - b) ** 2))

# k-NN algorithm
def knn_predict(X_train, y_train, test_row, k):
    distances = []
    for i in range(len(X_train)):
        dist = euclidean_distance(test_row, X_train[i])
        distances.append((dist, y_train[i]))
    distances.sort(key=lambda x: x[0])
    k_nearest_labels = [label for (_, label) in distances[:k]]
    most_common = Counter(k_nearest_labels).most_common(1)
    return most_common[0][0]

iris = load_iris()
X = iris.data
y = iris.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Predict and track results
k = 3
correct, wrong = 0, 0
print("Predictions:\n")
for i in range(len(X_test)):
    prediction = knn_predict(X_train, y_train, X_test[i], k)
    actual = y_test[i]
    if prediction == actual:
        correct += 1
        print(f"Correct: Predicted={prediction}, Actual={actual}")
    else:
        wrong += 1
        print(f"Wrong  : Predicted={prediction}, Actual={actual}")

accuracy = correct / len(X_test) * 100

print(f"\nTotal Correct Predictions: {correct}")
print(f"Total Wrong Predictions  : {wrong}")
print(f"Accuracy: {accuracy:.2f}%")

OUTPUT

EXPERIMENT - 10

AIM:
Apply k-Means clustering algorithm on suitable datasets and comment on the quality of
clustering.

THEORY:
What is K-Means?
K-Means is an unsupervised learning algorithm used for clustering data into groups (called
clusters). It groups data points such that those in the same cluster are more similar to each
other than to those in other clusters.
It is widely used in market segmentation, pattern recognition, image compression, and other
applications where labeled data is not available.

Key Concepts
Unsupervised: No labeled output is required; the algorithm tries to discover natural
groupings.
K: The number of clusters you want to divide your data into.
Centroid: The center of a cluster. It’s the average of all points in the cluster.
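
One simple way to judge clustering quality, used in the code below, is the within-cluster sum of squares (WCSS): the sum, over all clusters and all points in each cluster, of the squared distance from the point to its cluster centroid. Lower WCSS indicates tighter, more compact clusters, and plotting WCSS against different values of k is the basis of the elbow method for choosing the number of clusters.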

CODE
import csv
import random
import math

def load_dataset(filename):
    with open(filename, 'r') as file:
        reader = csv.reader(file)
        next(reader)  # skip header
        dataset = []
        for row in reader:
            income = float(row[2])
            score = float(row[3])
            dataset.append([income, score])
    return dataset

def euclidean_distance(a, b):
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(len(a))))

def initialize_centroids(dataset, k):
    return random.sample(dataset, k)

def assign_clusters(dataset, centroids):
    clusters = [[] for _ in centroids]
    for point in dataset:
        distances = [euclidean_distance(point, centroid) for centroid in centroids]
        cluster_idx = distances.index(min(distances))
        clusters[cluster_idx].append(point)
    return clusters

def update_centroids(clusters):
    new_centroids = []
    for cluster in clusters:
        if cluster:
            mean = [sum(col) / len(col) for col in zip(*cluster)]
            new_centroids.append(mean)
        else:
            new_centroids.append([0] * len(clusters[0][0]))  # placeholder
    return new_centroids

def compute_wcss(clusters, centroids):
    wcss = 0
    for idx, cluster in enumerate(clusters):
        for point in cluster:
            wcss += euclidean_distance(point, centroids[idx]) ** 2
    return wcss

def k_means(dataset, k=3, max_iters=100):
    centroids = initialize_centroids(dataset, k)
    for _ in range(max_iters):
        clusters = assign_clusters(dataset, centroids)
        new_centroids = update_centroids(clusters)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    wcss = compute_wcss(clusters, centroids)
    return clusters, centroids, wcss

if __name__ == "__main__":
    dataset = load_dataset("customers.csv")
    clusters, centroids, wcss = k_means(dataset, k=3)

    for i, cluster in enumerate(clusters):
        print(f"Cluster {i+1}: {len(cluster)} customers")

    print(f"\n📉 WCSS (lower is better): {wcss:.2f}")

OUTPUT

EXPERIMENT - 11

AIM:
Write a program to implement ensemble algorithms - AdaBoost and Bagging using the
appropriate dataset and evaluate their performance on that dataset.

THEORY:
What are Ensemble Methods?
Ensemble methods are machine learning techniques that combine the predictions of multiple
models to produce a more accurate and robust prediction than any single model could
achieve. By aggregating several learners, ensemble methods help reduce overfitting, variance,
and bias.

Two popular ensemble techniques are:

AdaBoost (Adaptive Boosting):


AdaBoost is a boosting algorithm that builds a strong classifier by combining multiple weak
classifiers, typically decision trees with one level (decision stumps). It assigns weights to
each instance in the dataset, increasing the weights of incorrectly classified instances in each
iteration to focus more on them in the next model.
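
In the standard binary formulation, a weak learner with weighted error ε_t is given the vote α_t = 0.5 * ln((1 - ε_t) / ε_t), and the weights of the instances it misclassified are multiplied by e^(α_t) (then renormalized) so that the next learner concentrates on them; the final prediction is the weighted vote of all weak learners. The scikit-learn AdaBoostClassifier used below performs these updates internally.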

Bagging (Bootstrap Aggregating):


Bagging is an ensemble method that trains multiple models (usually of the same type) on
different random subsets of the training data and then averages their predictions (for
regression) or uses majority voting (for classification). Bagging helps reduce variance and
prevents overfitting.

CODE
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# AdaBoost
ada_model = AdaBoostClassifier(n_estimators=50)
ada_model.fit(X_train, y_train)
ada_preds = ada_model.predict(X_test)
print(f"AdaBoost Accuracy: {accuracy_score(y_test, ada_preds):.2f}")

# Bagging (use 'estimator' instead of 'base_estimator')


bag_model = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=50)
bag_model.fit(X_train, y_train)
bag_preds = bag_model.predict(X_test)
print(f"Bagging Accuracy: {accuracy_score(y_test, bag_preds):.2f}")

OUTPUT

EXPERIMENT - 12

AIM:
Select any two datasets based on their statistics and perform comparison among all the
implemented algorithms using them.

THEORY:

Model comparison involves evaluating the performance of multiple machine learning algorithms on specific datasets to determine which performs best for a given task. This is crucial because different models may yield varying results depending on the nature of the dataset (classification or regression) and its features.

In this implementation, we compare:

●​ Classification algorithms on the Iris dataset​

●​ Regression algorithms on the California Housing dataset

These comparisons help in understanding:

●​ The strengths and weaknesses of different models​

●​ How well a model generalizes to unseen data​

●​ The trade-offs between accuracy, interpretability, training time, etc.​
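
The regression metrics reported below are the mean squared error MSE = (1/n) * Σ (y_i - ŷ_i)², its square root RMSE = sqrt(MSE), and the coefficient of determination R² = 1 - Σ(y_i - ŷ_i)² / Σ(y_i - ȳ)², where values closer to 1 indicate a better fit. The classification metrics (accuracy, precision, recall, F1-score) are computed as weighted averages over the three Iris classes, matching the average='weighted' option used in the code.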

CODE
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris, fetch_california_housing
from sklearn.linear_model import LogisticRegression, LinearRegression, Ridge
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.svm import SVC, SVR
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             confusion_matrix, mean_squared_error, r2_score)

# Load Iris Dataset (for classification)


iris = load_iris()
X_iris = iris.data
y_iris = iris.target

# Load California Housing Dataset (for regression)
california = fetch_california_housing()
X_california = california.data
y_california = california.target

# Split datasets into training and testing sets
X_train_iris, X_test_iris, y_train_iris, y_test_iris = train_test_split(
    X_iris, y_iris, test_size=0.3, random_state=42)
X_train_california, X_test_california, y_train_california, y_test_california = train_test_split(
    X_california, y_california, test_size=0.3, random_state=42)

# Classification Models (for Iris Dataset)


classifiers = {
'Logistic Regression': LogisticRegression(max_iter=200),
'Decision Tree': DecisionTreeClassifier(),
'Random Forest': RandomForestClassifier(),
'SVM': SVC(),
'K-Nearest Neighbors': KNeighborsClassifier(),
'Naive Bayes': GaussianNB()
}

# Regression Models (for California Housing Dataset)


regressors = {
'Linear Regression': LinearRegression(),
'Decision Tree Regressor': DecisionTreeRegressor(),
'Random Forest Regressor': RandomForestRegressor(),
'SVR': SVR(),
'K-Nearest Neighbors Regressor': KNeighborsRegressor(),
'Ridge Regression': Ridge()
}

# Function to evaluate classification models
def evaluate_classification_model(model, X_train, X_test, y_train, y_test):
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(y_test, y_pred, average='weighted')
    cm = confusion_matrix(y_test, y_pred)
    return accuracy, precision, recall, f1, cm

# Function to evaluate regression models
def evaluate_regression_model(model, X_train, X_test, y_train, y_test):
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    rmse = np.sqrt(mse)
    r2 = r2_score(y_test, y_pred)
    return mse, rmse, r2

# Evaluate Classification Models on Iris Dataset
classification_results = {}
for model_name, model in classifiers.items():
    accuracy, precision, recall, f1, cm = evaluate_classification_model(
        model, X_train_iris, X_test_iris, y_train_iris, y_test_iris)
    classification_results[model_name] = {'Accuracy': accuracy, 'Precision': precision,
                                          'Recall': recall, 'F1-Score': f1, 'Confusion Matrix': cm}

# Evaluate Regression Models on California Housing Dataset
regression_results = {}
for model_name, model in regressors.items():
    mse, rmse, r2 = evaluate_regression_model(
        model, X_train_california, X_test_california, y_train_california, y_test_california)
    regression_results[model_name] = {'MSE': mse, 'RMSE': rmse, 'R2': r2}

# Print Results for Classification
print("Classification Model Results (Iris Dataset):")
for model_name, result in classification_results.items():
    print(f"\n{model_name}:")
    print(f"Accuracy: {result['Accuracy']:.4f}")
    print(f"Precision: {result['Precision']:.4f}")
    print(f"Recall: {result['Recall']:.4f}")
    print(f"F1-Score: {result['F1-Score']:.4f}")
    print(f"Confusion Matrix:\n{result['Confusion Matrix']}\n")

# Print Results for Regression
print("\nRegression Model Results (California Housing Dataset):")
for model_name, result in regression_results.items():
    print(f"\n{model_name}:")
    print(f"MSE: {result['MSE']:.4f}")
    print(f"RMSE: {result['RMSE']:.4f}")
    print(f"R2: {result['R2']:.4f}")

OUTPUT

EXPERIMENT - 13

AIM:
Conduct a survey of at least five different machine learning tools available.

THEORY:
Machine learning tools are software platforms or libraries that provide functionalities to
build, train, evaluate, and deploy machine learning models. These tools can range from
programming libraries to complete no-code platforms and are essential for data scientists and
ML engineers.

They differ based on:

●​ User Interface (Code-based vs. No-code)​

●​ Level of abstraction (Low-level like TensorFlow vs. high-level like Scikit-learn)​

●​ Use case (General-purpose vs. specialized like AutoML or computer vision)​

●​ Integration with other tools (e.g., deployment on cloud or mobile)​

SURVEY

Each machine learning tool surveyed in this experiment has unique strengths and is suited for
specific use cases:

●​ Scikit-learn is highly suitable for beginners and practitioners working with classical
machine learning algorithms such as regression, classification, and clustering. Its
simplicity and extensive documentation make it ideal for educational purposes and
prototyping.​

●​ PyTorch is preferred in academic and research environments due to its flexibility and
dynamic computation graph, making it easier to debug and experiment with custom
deep learning models.​

●​ TensorFlow is a powerful tool for building and deploying deep learning models at
scale. It is especially well-suited for production environments due to its robust
deployment features, including TensorFlow Serving and TensorFlow Lite.​

●​ Google AutoML is designed for non-technical users or those looking for rapid model
development without the need for in-depth knowledge of machine learning. It
automates most of the model-building pipeline, including preprocessing, training, and
deployment.​

●​ Weka is a GUI-based tool that is easy to use and valuable for educational purposes
and initial data analysis. However, it lacks the capabilities needed for modern deep
learning tasks.​

In summary, the selection of a machine learning tool should be based on the specific
requirements of the task, including the complexity of the model, the level of user expertise,
and the intended deployment environment.
