ML LAB_MANUAL
Course Objective: The objective of this lab is to give an overview of various machine learning
techniques and to demonstrate them using Python.
Course Outcomes:
List of Experiments:
1. Write a Python program to compute measures of central tendency (mean, median, mode) and
measures of dispersion (variance, standard deviation)
2. Study of Python basic libraries such as statistics, math, NumPy and SciPy
5. Implementation of Multiple Linear Regression for House Price Prediction using sklearn
1. Write a Python program to compute measures of central tendency (mean, median, mode)
and measures of dispersion (variance, standard deviation)
Solution:
# Import Python's built-in statistics module, which provides functions for statistical calculations.
import statistics
# Import Python's built-in math module, which provides mathematical functions beyond basic arithmetic.
import math
l = [1, 3, 8, 15]
# The mean is the average value: the sum of all values divided by the number of values.
print(statistics.mean(l))
6.75
import statistics
s = [5, 6, 7, 8, 9, 11]
print(statistics.mean(s))
print(statistics.median(s))
7.666666666666667
7.5
print(statistics.mean([1, 3, 5, 7, 9, 11]))
6
print(statistics.median([1, 3, 5, 7, 9, 11]))
6.0
print(statistics.variance([1, 3, 5, 7, 9, 11]))
14
Other recorded outputs (inputs not listed): 5, 7, 1.8666666666666667, 1.05, red, 0.47966666666666663, 70.80333333333333, 1736.9166666666667
import statistics
def compute_statistics(data):
    # Measures of central tendency
    mean = statistics.mean(data)
    median = statistics.median(data)
    mode = statistics.mode(data)
    # Measures of dispersion (sample variance and sample standard deviation)
    variance = statistics.variance(data)
    std_dev = statistics.stdev(data)
    print(f"Mean: {mean}")
    print(f"Median: {median}")
    print(f"Mode: {mode}")
    print(f"Variance: {variance}")
    print(f"Standard Deviation: {std_dev:.4f}")
if __name__ == "__main__":
    data = [1, 2, 2, 3, 4, 5, 5, 5, 6, 7]
    compute_statistics(data)
Mean: 4
Median: 4.5
Mode: 5
Variance: 3.7777777777777777
Standard Deviation: 1.9437
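The measures used in the program above can also be computed directly from their definitions, which makes a useful cross-check on the library calls. The following is a minimal sketch: the helper names (manual_mean, manual_median, and so on) are hypothetical, and the variance uses the sample form (division by n - 1), which is the form statistics.variance computes.

```python
import math
import statistics
from collections import Counter

def manual_mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

def manual_median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def manual_mode(values):
    """Most frequent value."""
    return Counter(values).most_common(1)[0][0]

def manual_variance(values):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    m = manual_mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def manual_std_dev(values):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(manual_variance(values))

data = [1, 2, 2, 3, 4, 5, 5, 5, 6, 7]
print("Mean:", manual_mean(data))
print("Median:", manual_median(data))
print("Mode:", manual_mode(data))
print("Variance:", manual_variance(data))
print("Standard Deviation:", manual_std_dev(data))
```

The results should agree with statistics.mean, statistics.median, statistics.mode, statistics.variance and statistics.stdev on the same list, up to floating-point rounding.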
Theory/Description:
Python Libraries
One reason Python is popular among developers is its large collection of libraries. This section
covers the Python Standard Library and third-party libraries such as NumPy and SciPy. A module is a
file containing Python code, and a package is a directory of modules and sub-packages. A library is a
reusable collection of modules that you can include in your programs or projects. Third-party
packages such as NumPy are installed with a package manager such as pip.
Python Standard Library
The Python Standard Library is a collection of modules that ships with Python. It simplifies
programming by removing the need to rewrite commonly used functionality. A module is made available
by importing it, usually at the beginning of a script. Some of the most important Standard Library
modules are:
time, sys, csv, math, random, os, statistics, tkinter, socket
To display a list of all available modules, use the following command in the Python console:
>>>help('modules')
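As an illustration of importing Standard Library modules at the top of a script, here is a small sketch. The variable names, the sample numbers and the file path are made up for the example; only the modules themselves come from the list above.

```python
import math                  # import a whole module
import random
import statistics as st      # import a module under a shorter alias
from os import path          # import a single name from a module

root = math.sqrt(16)                     # square root of 16
average = st.mean([2, 4, 6])             # mean of a small list
random.seed(0)                           # fix the seed so runs are repeatable
roll = random.randint(1, 6)              # simulated die roll between 1 and 6
file_name = path.basename("/tmp/data/scores.csv")  # last component of a path

print(root)
print(average)
print(roll)
print(file_name)
```

The three import forms differ only in how the names are referred to afterwards: math.sqrt, st.mean and path.basename all call functions that live in Standard Library modules.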
Widely used third-party Python libraries include:
Beautiful Soup
Scrapy
Selenium
Pandas
PyOD
NumPy
Scipy
Spacy
Matplotlib
Seaborn
Bokeh
NumPy (Numerical Python): Efficient numerical operations, arrays, and mathematical
computations.
SciPy (Scientific Python): Built on top of NumPy, providing additional functionalities for
optimization, integration, statistics, and signal processing.
NumPy provides a powerful n-dimensional array object (ndarray) and functions for numerical
computation.
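import numpy as np
# Creating arrays (the values are illustrative)
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
# Display arrays
print("1D Array:", arr1)
print("2D Array:\n", arr2)
# Array Properties
print("Shape:", arr2.shape)
print("Data type:", arr1.dtype)
# Array Operations
print("Sum:", np.sum(arr1))
print("Mean:", np.mean(arr1))
# Element-wise Operations
print("Multiplication:", arr1 * 2)
# Random Numbers
print("Random values:", np.random.rand(3))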
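Because the ndarray is n-dimensional, most NumPy operations accept an axis argument and combine arrays of different shapes through broadcasting. The sketch below shows both on a small matrix; the matrix values and variable names are illustrative assumptions.

```python
import numpy as np

# A 2 x 3 matrix (two rows, three columns); the values are illustrative.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print("Shape:", matrix.shape)            # (rows, columns)
print("Number of dimensions:", matrix.ndim)

# axis=0 collapses the rows, giving one result per column.
column_sums = matrix.sum(axis=0)
# axis=1 collapses the columns, giving one result per row.
row_means = matrix.mean(axis=1)

# Broadcasting: the 1-D array of column means is subtracted from every row.
centered = matrix - matrix.mean(axis=0)

print("Column sums:", column_sums)
print("Row means:", row_means)
print("Centered matrix:\n", centered)
```

Subtracting the column means in this way is a common preprocessing step, since it gives every column a mean of zero without an explicit loop.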
SciPy extends NumPy by adding modules for statistics, optimization, and signal processing.
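import numpy as np
from scipy import stats
# Statistics (the sample values are illustrative)
data = np.array([1, 2, 2, 3, 4, 5, 5, 5, 6, 7])
print("Mean:", np.mean(data))
print("Median:", np.median(data))
# stats.mode returns an array in older SciPy releases and a scalar in newer ones
print("Mode:", np.atleast_1d(stats.mode(data).mode)[0])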
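The optimization and signal-processing modules mentioned above can be sketched as follows. This is a minimal example under stated assumptions: the objective function func, the sampling rate, the signal frequencies and the 5 Hz cutoff are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy import optimize, signal

# Optimization: find the minimum of a hypothetical one-variable function.
def func(x):
    return (x - 3.0) ** 2 + 1.0

result = optimize.minimize_scalar(func)
print("Minimum found near x =", round(result.x, 4))

# Low-pass filter: remove a 30 Hz component from a 2 Hz sine wave.
fs = 100.0                                   # sampling rate in Hz (assumed)
t = np.arange(0.0, 1.0, 1.0 / fs)            # one second of samples
clean = np.sin(2 * np.pi * 2.0 * t)          # slow signal we want to keep
noisy = clean + 0.5 * np.sin(2 * np.pi * 30.0 * t)

# Fourth-order Butterworth low-pass filter with a 5 Hz cutoff (assumed).
b, a = signal.butter(4, 5.0, btype="low", fs=fs)
# filtfilt runs the filter forwards and backwards, so there is no phase shift.
filtered = signal.filtfilt(b, a, noisy)

error_before = np.mean(np.abs(noisy - clean))
error_after = np.mean(np.abs(filtered - clean))
print("Mean error before filtering:", error_before)
print("Mean error after filtering:", error_after)
```

The mean error after filtering should be noticeably smaller than before, because the 30 Hz component lies well above the cutoff while the 2 Hz signal lies below it.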
Week 3:
Study of Python Libraries for ML application such as Pandas and Matplotlib
Machine Learning requires efficient data handling, processing, and visualization. Python provides several
libraries that make these tasks easier, among which Pandas (for data manipulation) and Matplotlib (for
visualization) are widely used.
DataFrames & Series: Core data structures for handling tabular and labeled data.
Data Cleaning & Transformation: Handling missing values, filtering, merging, and reshaping
data.
Integration: Works well with other ML libraries such as Scikit-learn, TensorFlow, and PyTorch.
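import pandas as pd
# Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'Score': [85, 90, 95]}
df = pd.DataFrame(data)
# Display DataFrame
print(df)
# Basic Operations
print(df.head())          # first rows
print(df.describe())      # summary statistics of the numeric columns
print(df['Age'].mean())   # mean of a single column
# Data Manipulation (illustrative operations)
df['Passed'] = df['Score'] >= 90                  # add a derived column
print(df[df['Age'] > 28])                         # filter rows by a condition
print(df.sort_values('Score', ascending=False))   # sort by a column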
Preprocessing: Cleaning, normalizing, and structuring datasets before feeding into ML models.
Exploratory Data Analysis (EDA): Analyzing data distributions, correlations, and outliers.
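The cleaning, merging, normalizing and exploratory steps listed above can be sketched on a small table. Everything in this example is hypothetical: the student records, the city lookup table, the column names and the choice of median imputation and min-max scaling.

```python
import numpy as np
import pandas as pd

# Hypothetical student records with one missing score.
students = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Dana"],
    "Age": [25, 30, 35, 28],
    "Score": [85.0, np.nan, 95.0, 70.0],
})
# Hypothetical lookup table to merge in.
cities = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Dana"],
    "City": ["Pune", "Delhi", "Chennai", "Mumbai"],
})

# Cleaning: replace the missing score with the median of the observed scores.
students["Score"] = students["Score"].fillna(students["Score"].median())

# Filtering: keep only the rows that satisfy a condition.
older_than_27 = students[students["Age"] > 27]

# Merging: attach the city to each student by matching on the Name column.
merged = students.merge(cities, on="Name", how="left")

# Normalizing: min-max scale the scores into the range 0 to 1.
score_min, score_max = merged["Score"].min(), merged["Score"].max()
merged["ScoreScaled"] = (merged["Score"] - score_min) / (score_max - score_min)

# Exploratory analysis: summary statistics and a correlation.
print(merged[["Age", "Score"]].describe())
age_score_corr = merged["Age"].corr(merged["Score"])
print("Correlation between Age and Score:", age_score_corr)
print(merged)
```

Median imputation is only one option for missing values. Dropping the affected rows with dropna or filling with the mean are common alternatives, and the better choice depends on the dataset.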
Matplotlib is a powerful library for creating static, animated, and interactive visualizations.
Plotting Types: Line plots, bar charts, histograms, scatter plots, etc.
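import matplotlib.pyplot as plt
import numpy as np
# Sample Data (the y values are illustrative)
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
# Line Plot
plt.plot(x, y, label="y values")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Line Plot")
plt.legend()
plt.show()
# Scatter Plot
plt.scatter(x, y, color='r')
plt.title("Scatter Plot")
plt.show()
# Histogram
data = np.random.randn(1000)
plt.hist(data, bins=30)
plt.title("Histogram")
plt.show()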
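import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")
# Data Preprocessing: fill missing ages with the median age
df['Age'] = df['Age'].fillna(df['Age'].median())
# Histogram of passenger ages
plt.hist(df['Age'], bins=20)
plt.xlabel("Age")
plt.ylabel("Count")
plt.title("Age Distribution")
plt.show()
# Scatter plot of age against fare
plt.scatter(df['Age'], df['Fare'])
plt.xlabel("Age")
plt.ylabel("Fare")
plt.title("Age vs Fare")
plt.show()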
Week 4:
Write a Python program to implement Simple Linear
Regression and plot the graph.
Linear Regression: Linear regression is an algorithm that models a linear relationship between an
independent variable and a dependent variable in order to predict outcomes. It is a statistical method
used in data science and machine learning for predictive analysis. Linear regression is a supervised
learning algorithm that predicts continuous or numeric variables such as sales, salary, age, and
product price. Simple linear regression uses a single independent variable and fits a straight line
y = b0 + b1*x to the data.
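A minimal sketch of simple linear regression follows, using the closed-form least-squares formulas with NumPy. The sample data and the function names fit_simple_linear_regression and plot_fit are hypothetical; this is one possible way to do the exercise, not a prescribed solution.

```python
import numpy as np

def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # The fitted line always passes through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept

def plot_fit(x, y, slope, intercept):
    """Scatter the data points and draw the fitted line (requires Matplotlib)."""
    import matplotlib.pyplot as plt
    x = np.asarray(x, dtype=float)
    plt.scatter(x, y, color="blue", label="Data points")
    plt.plot(x, intercept + slope * x, color="red", label="Fitted line")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.title("Simple Linear Regression")
    plt.legend()
    plt.show()

# Hypothetical sample data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_simple_linear_regression(x, y)
print(f"Slope: {slope:.2f}, Intercept: {intercept:.2f}")

# Predict the dependent variable for a new value of x.
predicted = intercept + slope * 6
print(f"Predicted y for x = 6: {predicted:.2f}")
```

Calling plot_fit(x, y, slope, intercept) draws the scatter plot with the regression line on top. scikit-learn's sklearn.linear_model.LinearRegression should give the same slope and intercept on this data and is the usual choice once there are several independent variables.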