ML LAB_MANUAL

The document outlines a lab course in machine learning, focusing on various techniques and their implementation using Python. It includes objectives, outcomes, a list of experiments, and detailed instructions for using Python libraries like NumPy, SciPy, Pandas, and Matplotlib for data analysis and visualization. The course covers statistical measures, regression models, and performance analysis of classification algorithms.

Uploaded by

Swathi Reddy
Copyright
© All Rights Reserved

Department of Computer Science and Engineering

Course Objective: The objective of this lab is to give students an overview of the various machine learning
techniques and to enable them to demonstrate those techniques using Python.

Course Outcomes:

• Understand modern notions in predictive data analysis
• Perform data selection and model selection, assess model complexity, and identify trends
• Understand a range of machine learning algorithms along with their strengths and weaknesses
• Build predictive models from data and analyze their performance

List of Experiments:

1. Write a Python program to compute Central Tendency Measures (Mean, Median, Mode) and Measures of
Dispersion (Variance, Standard Deviation)

2. Study of Python Basic Libraries such as Statistics, Math, NumPy and SciPy

3. Study of Python Libraries for ML application such as Pandas and Matplotlib

4. Write a Python program to implement Simple Linear Regression

5. Implementation of Multiple Linear Regression for House Price Prediction using sklearn

6. Implementation of Decision tree using sklearn and its parameter tuning

7. Implementation of KNN using sklearn

8. Implementation of Logistic Regression using sklearn

9. Implementation of K-Means Clustering

10. Performance analysis of Classification Algorithms on a specific dataset (Mini Project)


WEEK1:

1. Write a Python program to compute Central Tendency Measures (Mean, Median, Mode) and
Measures of Dispersion (Variance, Standard Deviation)

Solution:

# Import Python's built-in statistics module, which provides functions for statistical calculations
import statistics

# Import Python's built-in math module, which provides mathematical functions beyond basic arithmetic
import math

l = [1, 3, 8, 15]

# The mean is the average value: the sum of all values divided by the number of values
print(statistics.mean(l))

6.75

import statistics

s=[5,6,7,8,9,11]

print(statistics.mean(s))

print(statistics.median(s))

print(statistics.mean([1, 3, 5, 7, 9, 11, 13]))

print(statistics.mean([1, 3, 5, 7, 9, 11]))

print(statistics.mean([-11, 5.5, -3.4, 7.1, -9, 22]))

7.666666666666667

7.5

7

6

1.8666666666666667

# Calculate the median from a sample of data

print(statistics.median([1, 3, 5, 7, 9, 11, 13]))

print(statistics.median([1, 3, 5, 7, 9, 11]))

print(statistics.median([-11, 5.5, -3.4, 7.1, -9, 22]))

7

6.0

1.05

# Calculate the mode from a sample of data

print(statistics.mode([1, 3, 3, 3, 3,5, 7, 9, 11]))

print(statistics.mode([1, 1, 3, -5, 7, -9, 11]))

print(statistics.mode(['red', 'green', 'blue', 'red']))

3

1

red

print(statistics.variance([1, 3, 5, 7, 9, 11]))

print(statistics.variance([2, 2.5, 1.25, 3.1, 1.75, 2.8]))

print(statistics.variance([-11, 5.5, -3.4, 7.1]))

print(statistics.variance([1, 30, 50, 100]))

14

0.47966666666666663

70.80333333333333
1736.9166666666667

import statistics

def compute_statistics(data):
    # Measures of central tendency
    mean = statistics.mean(data)
    median = statistics.median(data)
    mode = statistics.mode(data)

    # Measures of dispersion (sample variance and sample standard deviation)
    variance = statistics.variance(data)
    std_dev = statistics.stdev(data)

    print(f"Mean: {mean}")
    print(f"Median: {median}")
    print(f"Mode: {mode}")
    print(f"Variance: {variance}")
    print(f"Standard Deviation: {std_dev}")

if __name__ == "__main__":
    data = [1, 2, 2, 3, 4, 5, 5, 5, 6, 7]
    compute_statistics(data)

Mean: 4

Median: 4.5

Mode: 5

Variance: 3.7777777777777777

Standard Deviation: 1.9436506316151
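
The formulas behind these results can also be written out by hand. The following is a minimal sketch, not part of the original solution, that reuses the same data list and the math module imported earlier. It assumes the sample formulas (division by n - 1), which is what statistics.variance and statistics.stdev compute, and shows the population variance (division by n) for comparison.

```python
import math
import statistics

data = [1, 2, 2, 3, 4, 5, 5, 5, 6, 7]
n = len(data)

# Mean: sum of all values divided by the number of values
mean = sum(data) / n

# Sum of squared deviations from the mean
squared_devs = sum((x - mean) ** 2 for x in data)

# Sample variance divides by n - 1; population variance divides by n
sample_variance = squared_devs / (n - 1)
population_variance = squared_devs / n

# Standard deviation is the square root of the variance
sample_std = math.sqrt(sample_variance)

print("Mean:", mean)
print("Sample variance:", sample_variance)
print("Population variance:", population_variance)
print("Sample standard deviation:", sample_std)

# Cross-check the hand-written formulas against the statistics module
print(math.isclose(sample_variance, statistics.variance(data)))
print(math.isclose(population_variance, statistics.pvariance(data)))
print(math.isclose(sample_std, statistics.stdev(data)))
```

The three cross-checks should all print True, which confirms that the module uses the n - 1 divisor for variance and stdev and the n divisor for pvariance.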


WEEK2: Implementation of Python Basic Libraries such as Math, NumPy and SciPy

Theory/Description:

Python Libraries

One reason Python is popular among developers is its large collection of libraries. This section discusses
the Python Standard Library and some of the third-party libraries available for Python, such as SciPy and
NumPy. A module is a file containing Python code, and a package is a directory of modules and
sub-packages. A library is a reusable collection of code that you can include in your programs or projects;
the term loosely describes a collection of modules and packages. Third-party libraries such as NumPy are
distributed as packages that can be installed with a package manager like pip.

Python Standard Library

The Python Standard Library is a collection of modules that ships with Python. It simplifies programming by
removing the need to rewrite commonly used functionality. A module is made available by importing it,
usually at the beginning of a script. Some of the most important Standard Library modules are: time, sys,
csv, math, random, os, statistics, tkinter, socket.

To display a list of all available modules, use the following command in the Python console:
>>> help('modules')
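
As an illustration of importing Standard Library modules, here is a small sketch. The specific calls are chosen only as examples and are not taken from the manual; it shows the two common import forms.

```python
import math                   # import a whole module
import random
from statistics import mean   # import a single name from a module

# Functions from an imported module are accessed with the module prefix
print(math.sqrt(16))

# Seeding the generator makes the "random" result repeatable between runs
random.seed(0)
roll = random.randint(1, 6)   # a whole number from 1 to 6, both ends included
print(roll)

# A name imported with "from ... import ..." is used without the prefix
print(mean([2, 4, 6]))
```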

List of important Python Libraries

Python Libraries for Data Collection

• Beautiful Soup
• Scrapy
• Selenium

Python Libraries for Data Cleaning and Manipulation

• Pandas
• PyOD
• NumPy
• SciPy
• spaCy

Python Libraries for Data Visualization

• Matplotlib
• Seaborn
• Bokeh

The two libraries studied this week:

• NumPy (Numerical Python): Efficient numerical operations, arrays, and mathematical
computations.

• SciPy (Scientific Python): Built on top of NumPy, providing additional functionality for
optimization, integration, statistics, and signal processing.

1. NumPy: Numerical Computation & Array Operations

NumPy provides a powerful n-dimensional array object (ndarray) and functions for numerical
computation.

1.1 Installing NumPy

pip install numpy

1.2 Basic NumPy Operations

import numpy as np

# Creating arrays

arr1 = np.array([1, 2, 3, 4, 5])

arr2 = np.array([[1, 2, 3], [4, 5, 6]]) # 2D array

# Display arrays
print("1D Array:", arr1)

print("2D Array:\n", arr2)

# Array Properties

print("Shape:", arr2.shape) # (rows, columns)

print("Size:", arr2.size) # Total elements

print("Data Type:", arr2.dtype)

# Array Operations

print("Sum:", np.sum(arr1))

print("Mean:", np.mean(arr1))

print("Standard Deviation:", np.std(arr1))

# Element-wise Operations

print("Multiplication:", arr1 * 2)

print("Square Root:", np.sqrt(arr1))

# Creating Special Arrays

zeros = np.zeros((3,3)) # 3x3 matrix of zeros

ones = np.ones((2,2)) # 2x2 matrix of ones

identity = np.eye(3) # 3x3 identity matrix

# Random Numbers

rand_array = np.random.rand(3,3) # 3x3 random values


1.3 NumPy in Machine Learning

• Dataset Handling: Used to load, manipulate, and preprocess data.

• Linear Algebra: Matrix operations in deep learning and ML.

• Random Sampling: Initializing weights in neural networks.
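
The three uses listed above can be sketched in a few lines. This is an illustrative example only: the feature values are made up, and the weight scale of 0.01 is an arbitrary assumption rather than a recommendation from the manual.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded generator for repeatable results

# Dataset handling: 4 samples with 2 features each (made-up values)
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0],
              [4.0, 500.0]])

# Preprocessing: standardize each feature column to zero mean and unit variance
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Random sampling: small random initial weights, one per feature
weights = rng.normal(loc=0.0, scale=0.01, size=(2, 1))

# Linear algebra: the matrix product gives one output value per sample
outputs = X_scaled @ weights

print("Scaled features:\n", X_scaled)
print("Output shape:", outputs.shape)
```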

2. SciPy: Scientific Computation & Advanced Operations

SciPy extends NumPy by adding modules for statistics, optimization, and signal processing.

2.1 Installing SciPy

pip install scipy

2.2 SciPy Modules & Examples

2.2.1 Optimization (scipy.optimize)

Used for solving mathematical optimization problems.

from scipy.optimize import minimize

# Define function to minimize (e.g., x^2 + 3x + 5)

def func(x):
    return x**2 + 3*x + 5

result = minimize(func, x0=0) # Find minimum starting at x=0

print("Optimized Result:", result.x)

2.2.2 Linear Algebra (scipy.linalg)

import numpy as np

from scipy.linalg import inv, det

A = np.array([[4, 7], [2, 6]])


print("Determinant:", det(A)) # Compute determinant

print("Inverse Matrix:\n", inv(A)) # Compute inverse
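
As a quick check of the inverse, and as an example of a related scipy.linalg function, the sketch below multiplies A by its inverse and solves a linear system A x = b. The right-hand side b is a hypothetical value introduced only for this illustration.

```python
import numpy as np
from scipy.linalg import inv, solve

A = np.array([[4, 7], [2, 6]])
b = np.array([1, 2])  # hypothetical right-hand side, not from the manual

A_inv = inv(A)

# A matrix times its inverse should give the identity matrix (up to rounding)
print(np.allclose(A @ A_inv, np.eye(2)))

# Solve A x = b directly, without forming the inverse explicitly
x = solve(A, b)
print("Solution:", x)

# Substituting the solution back should reproduce b
print(np.allclose(A @ x, b))
```

Using solve is generally preferred to computing inv(A) @ b, since it avoids forming the inverse and tends to be more accurate numerically.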

2.2.3 Statistics (scipy.stats)

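import numpy as np

from scipy import stats

data = [12, 15, 14, 10, 13, 18, 21, 19]

print("Mean:", np.mean(data))

print("Median:", np.median(data))

# stats.mode returns an array in older SciPy releases and a scalar in newer ones;
# np.atleast_1d handles both cases
print("Mode:", np.atleast_1d(stats.mode(data).mode)[0])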

print("Standard Deviation:", np.std(data))

2.2.4 Signal Processing (scipy.signal)

import numpy as np

from scipy.signal import butter, filtfilt

# Low-pass filter

b, a = butter(3, 0.05) # 3rd order, cutoff 0.05

filtered_signal = filtfilt(b, a, np.sin(np.linspace(0, 10, 100)))
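
Integration is listed among SciPy's features in the overview but has no example in this section. A minimal sketch using scipy.integrate.quad follows; the integrand sin(x) on the interval 0 to pi is chosen here only because its exact integral is known to be 2.

```python
import math
from scipy.integrate import quad

# quad returns the numerical value of the integral and an estimate of the absolute error
value, abs_error = quad(math.sin, 0, math.pi)

print("Integral:", value)
print("Estimated error:", abs_error)
```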

Week 3:
Study of Python Libraries for ML application such as Pandas and Matplotlib

1. Introduction to Python for ML

Machine Learning requires efficient data handling, processing, and visualization. Python provides several
libraries that make these tasks easier, among which Pandas (for data manipulation) and Matplotlib (for
visualization) are widely used.

2. Pandas: Data Handling & Manipulation


Pandas is a Python library used for data analysis and manipulation, built on top of NumPy.

2.1 Key Features

• DataFrames & Series: Core data structures for handling tabular and labeled data.

• Data Cleaning & Transformation: Handling missing values, filtering, merging, and reshaping
data.

• Descriptive Statistics: Mean, median, correlation, and other statistical operations.

• Integration: Works well with other ML libraries such as Scikit-learn, TensorFlow, and PyTorch.

2.2 Common Pandas Functions

import pandas as pd

# Creating a DataFrame

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'Score': [85, 90, 95]}

df = pd.DataFrame(data)

# Display DataFrame

print(df)

# Basic Operations

print(df.describe()) # Summary statistics

print(df.head(2)) # First two rows

print(df.dtypes) # Data types of columns

# Data Manipulation

df['Age'] = df['Age'] + 1 # Modify values


df_filtered = df[df['Score'] > 85] # Filtering data

df_sorted = df.sort_values(by='Age') # Sorting data

2.3 Use Cases in ML

• Preprocessing: Cleaning, normalizing, and structuring datasets before feeding into ML models.

• Feature Engineering: Creating new features from existing data.

• Exploratory Data Analysis (EDA): Analyzing data distributions, correlations, and outliers.
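
The three use cases above can be sketched on a tiny DataFrame. The column names and values below are made up for illustration and are not part of the lab data.

```python
import numpy as np
import pandas as pd

# Made-up data with one missing value in the Hours column
df = pd.DataFrame({
    "Hours": [1.0, 2.0, np.nan, 4.0, 5.0],
    "Score": [52, 58, 65, 71, 80],
})

# Preprocessing: fill the missing value with the column median
df["Hours"] = df["Hours"].fillna(df["Hours"].median())

# Feature engineering: derive a new column from existing ones
df["ScorePerHour"] = df["Score"] / df["Hours"]

# EDA: summary statistics and pairwise correlations
print(df.describe())
print(df.corr())
```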

3. Matplotlib: Data Visualization

Matplotlib is a powerful library for creating static, animated, and interactive visualizations.

3.1 Key Features

• Plotting Types: Line plots, bar charts, histograms, scatter plots, etc.

• Customization: Colors, labels, annotations, and styling.

• Integration: Works well with Pandas, NumPy, and Seaborn.

3.2 Common Matplotlib Functions

import matplotlib.pyplot as plt

# Sample Data

x = [1, 2, 3, 4, 5]

y = [10, 20, 25, 30, 50]

# Line Plot

plt.plot(x, y, marker='o', linestyle='-', color='b', label="Growth")

plt.xlabel("X-axis")

plt.ylabel("Y-axis")

plt.title("Simple Line Plot")

plt.legend()
plt.show()

# Scatter Plot

plt.scatter(x, y, color='r')

plt.title("Scatter Plot")

plt.show()

# Histogram

import numpy as np

data = np.random.randn(1000)

plt.hist(data, bins=30, color='g', alpha=0.7)

plt.title("Histogram")

plt.show()

3.3 Use Cases in ML

• Data Exploration: Understanding data distributions and trends.

• Feature Relationships: Identifying correlations between variables.

• Model Performance Evaluation: Visualizing errors, predictions, and accuracy.
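
As one possible illustration of model performance evaluation, the sketch below plots predicted values against actual values and shows the prediction errors as a bar chart. The numbers are made up and no real model is involved. The file name model_evaluation.png and the Agg backend are assumptions so that the script also runs without a display; in an interactive session, the backend line can be dropped and plt.show() used instead of saving.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

# Made-up actual and predicted values
actual = np.array([10, 20, 25, 30, 50])
predicted = np.array([12, 18, 27, 33, 45])
errors = actual - predicted

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Left: predictions against actual values; points on the dashed line are perfect
ax1.scatter(actual, predicted, color="b")
limits = [actual.min(), actual.max()]
ax1.plot(limits, limits, linestyle="--", color="gray", label="Perfect prediction")
ax1.set_xlabel("Actual")
ax1.set_ylabel("Predicted")
ax1.set_title("Predicted vs Actual")
ax1.legend()

# Right: error (actual minus predicted) for each sample
ax2.bar(range(len(errors)), errors, color="r")
ax2.set_xlabel("Sample index")
ax2.set_ylabel("Error")
ax2.set_title("Prediction Errors")

fig.tight_layout()
fig.savefig("model_evaluation.png")
```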

4. Combining Pandas and Matplotlib for ML Applications

import pandas as pd

import matplotlib.pyplot as plt

# Load dataset (e.g., Titanic dataset)

df = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")

# Data Preprocessing: fill missing ages with the median age
df['Age'] = df['Age'].fillna(df['Age'].median())

# Plotting Age Distribution

plt.hist(df['Age'], bins=20, color='blue', alpha=0.7)

plt.xlabel("Age")

plt.ylabel("Count")

plt.title("Age Distribution of Titanic Passengers")

plt.show()

# Scatter Plot: Age vs Fare

plt.scatter(df['Age'], df['Fare'], alpha=0.5, color='red')

plt.xlabel("Age")

plt.ylabel("Fare")

plt.title("Age vs Fare")

plt.show()

Week 4:
Write a Python program to implement Simple Linear Regression and plot the graph.
Linear Regression: Linear regression is an algorithm that models a linear relationship between
an independent variable and a dependent variable in order to predict the outcome of future events. It is a
statistical method used in data science and machine learning for predictive analysis. Linear regression is a
supervised learning algorithm that fits a mathematical relationship between variables and makes predictions
for continuous or numeric variables such as sales, salary, age, and product price.
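
To make the idea concrete, here is a minimal sketch of simple linear regression using the ordinary least squares formulas, where the slope is the sum of (x - mean of x)(y - mean of y) divided by the sum of (x - mean of x) squared, and the intercept is the mean of y minus the slope times the mean of x. This is an illustration under assumed data (points lying exactly on y = 2x + 1), not necessarily the solution the manual intends, and the predict helper is a hypothetical name.

```python
import numpy as np

# Made-up data lying exactly on the line y = 2x + 1
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([3, 5, 7, 9, 11], dtype=float)

x_mean, y_mean = x.mean(), y.mean()

# Least squares estimates of the slope and intercept
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

def predict(x_new):
    """Predict y for a new x using the fitted line (hypothetical helper)."""
    return intercept + slope * x_new

print("Slope:", slope)
print("Intercept:", intercept)
print("Prediction for x = 6:", predict(6.0))
```

Because the data lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1; with noisy data the same formulas give the best-fitting line in the least squares sense.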
