Import Pandas As PD DF PD - Read - CSV ("Titanic - Train - CSV") DF - Head

The document presents code implementing several machine learning algorithms, mostly on the Titanic dataset. It includes code to: 1) Load and explore the dataset using pandas functions such as head(), tail(), and info(), and create a heatmap and pair plot. 2) Implement linear and logistic regression models for regression and classification tasks, evaluate them, and plot the results. 3) Build a naive Bayes classifier and compute its accuracy on test data. 4) Apply k-nearest neighbors and support vector machine algorithms for classification on the kyphosis dataset and evaluate the results. 5) Use decision tree and random forest classifiers on the kyphosis dataset and evaluate their performance on test data. 6) Train a small neural network with backpropagation on the XOR problem.

Uploaded by

Saloni Tuli


Ques 1: Write a program to extract the data from the database using Python. Use the
head(), tail(), and info() commands on the imported data. Create a heatmap (heat matrix)
and pair plot for the imported dataset.
Code:
import pandas as pd
df = pd.read_csv("titanic_train.csv")
df.head()
output:

df.tail()

df.info()

Heat matrix
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv("titanic_train.csv")
# Exclude non-numeric columns
numeric_columns = df.select_dtypes(include=['number']).columns
numeric_df = df[numeric_columns]
# Create a heatmap
plt.figure(figsize=(12, 8))
heatmap_data = numeric_df.corr()
sns.heatmap(heatmap_data, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Heatmap for Titanic Dataset')
plt.show()
output
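
The heatmap colors the matrix returned by numeric_df.corr(), which by default holds pairwise Pearson correlation coefficients. As a rough illustration of what a single cell contains, the sketch below computes that coefficient in plain Python. The pearson helper and the five sample values are invented for the example and are not taken from the Titanic file.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical passenger class and fare values (illustration only)
pclass = [1, 2, 3, 3, 1]
fare = [80.0, 20.0, 8.0, 7.5, 70.0]

r = pearson(pclass, fare)
print(f"Correlation between class and fare: {r:.2f}")
```

In this toy sample the coefficient is negative, because higher class numbers go with lower fares. A column correlated with itself always gives 1, which is why the diagonal of the heatmap is uniform.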

pair plots
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("titanic_train.csv")

# Filling missing values in the 'Age' column with the mean value
# Assign the result back; fillna(inplace=True) on a column selection is deprecated in recent pandas
df['Age'] = df['Age'].fillna(df['Age'].mean())

# Create a pairplot
sns.pairplot(df, hue='Survived', markers=["o", "s"])

# Show the plot

plt.show()

output
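
Before plotting, the code above replaces missing Age values with the column mean. A minimal sketch of that imputation step in plain Python, assuming missing entries are represented as None (pandas uses NaN instead) and using invented ages:

```python
# Hypothetical ages with two missing entries (illustration only)
ages = [22.0, None, 38.0, None, 30.0]

# The mean is computed from the known values only, as pandas does by default
known = [a for a in ages if a is not None]
mean_age = sum(known) / len(known)

# Replace each missing entry with that mean
filled = [mean_age if a is None else a for a in ages]
print(filled)
```

Mean imputation keeps the column average unchanged but shrinks its variance, which is worth remembering when reading the Age panels of the pair plot.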

Ques 2: Write a program to implement linear and logistic regression.
Code:
Linear regression
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.impute import SimpleImputer
import numpy as np

df = pd.read_csv("titanic_train.csv")
# Handling missing values in the 'Age' column using SimpleImputer
imputer = SimpleImputer(strategy='mean')
df['Age'] = imputer.fit_transform(df[['Age']])
# Selecting the features and target variable
X = df[['Age']].values
y = df['Fare'].values
# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Creating a linear regression model
model = LinearRegression()
# Training the model
model.fit(X_train, y_train)
# Making predictions on the test set
y_pred = model.predict(X_test)
# Evaluating the model
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')
# Predicting Fare for a new Age

new_age = np.array([[25]]) # Replace 25 with the desired age
predicted_fare = model.predict(new_age)
print(f'Predicted Fare for Age {new_age[0, 0]}: {predicted_fare[0]}')
# Plotting the linear regression line
plt.scatter(X_test, y_test, color='blue', label='Actual Fare')
plt.plot(X_test, y_pred, color='red', linewidth=3, label='Linear Regression Line')
plt.scatter(new_age, predicted_fare, color='green', marker='*', s=200,
            label=f'Predicted Fare for Age {new_age[0, 0]}')
plt.title('Linear Regression Model')
plt.xlabel('Age')
plt.ylabel('Fare')
plt.legend()
plt.show()
output:
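
LinearRegression fits an ordinary least squares line, and mean_squared_error averages the squared residuals. For a single feature both have a short closed form, sketched below in plain Python. The fit_line and mse helpers and the four data points are assumptions made for illustration, not part of the original program.

```python
def fit_line(xs, ys):
    """Ordinary least squares slope and intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical points lying exactly on y = 2x + 1 (illustration only)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
print(slope, intercept, mse(ys, predictions))
```

Because the sample points lie on a perfect line, the error is zero here. On real data such as Age versus Fare the relationship is weak, so a large mean squared error is expected.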

Logistic Regression
Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, f1_score, confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

# Read the CSV data

df = pd.read_csv("titanic_train.csv")

# Drop columns that are not needed for modeling

df = df.drop(['PassengerId', 'Name', 'Ticket', 'Cabin'], axis=1)

# Convert categorical variables to numerical

df['Sex'] = df['Sex'].map({'male': 0, 'female': 1})
df['Embarked'] = df['Embarked'].map({'S': 0, 'C': 1, 'Q': 2})

# Fill missing values in 'Age' with the median

df['Age'] = df['Age'].fillna(df['Age'].median())

# Fill missing values in 'Embarked' with the most common value

df['Embarked'] = df['Embarked'].fillna(df['Embarked'].mode()[0])

# Split the data into features (X) and target variable (y)
X = df.drop('Survived', axis=1)
y = df['Survived']

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the logistic regression model

# Raise the iteration cap; the default (100) may not converge on unscaled features
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Make predictions on the test set

y_pred = model.predict(X_test)

# Calculate evaluation metrics

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)

# Print the metrics

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"F1 Score: {f1:.2f}")
print(f"Confusion Matrix:\n{conf_matrix}")

# Plot the confusion matrix

plt.figure(figsize=(6, 6))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues', cbar=False,
xticklabels=['Not Survived', 'Survived'],
yticklabels=['Not Survived', 'Survived'])
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.show()

output:
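
The accuracy, precision, and F1 values printed above all derive from the four cells of the confusion matrix. The sketch below recomputes them by hand for a binary problem. The binary_metrics helper and the eight labels are hypothetical and stand in for y_test and y_pred.

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics for labels 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        # Same layout as scikit-learn: rows are actual, columns are predicted
        "matrix": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Hypothetical labels (1 = survived, 0 = not survived)
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = binary_metrics(actual, predicted)
print(metrics)
```

This simplified helper assumes at least one positive prediction and one actual positive; otherwise the divisions are undefined and a library routine with explicit zero-division handling is the safer choice.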

Ques 3: Write a program to implement the naïve Bayesian classifier for a sample
training data set stored as a CSV file. Compute the accuracy of the classifier,
considering a few test data sets.
Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

df = pd.read_csv("titanic_train.csv")

# Preprocess the data

# Take an explicit copy so the column assignments below do not act on a view of the original frame
df = df[['Survived', 'Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Embarked']].copy()
df['Sex'] = df['Sex'].map({'male': 0, 'female': 1})
df['Embarked'] = df['Embarked'].map({'S': 0, 'C': 1, 'Q': 2})
df['Age'] = df['Age'].fillna(df['Age'].median())
df['Embarked'] = df['Embarked'].fillna(df['Embarked'].mode()[0])

# Split the data into features and target

X = df.drop('Survived', axis=1)
y = df['Survived']

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the Naive Bayes classifier

nb_classifier = GaussianNB()
nb_classifier.fit(X_train, y_train)

# Make predictions on the test set

y_pred = nb_classifier.predict(X_test)

# Evaluate the classifier
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
classification_rep = classification_report(y_test, y_pred)

print(f'Accuracy: {accuracy}')
print(f'Confusion Matrix:\n{conf_matrix}')
print(f'Classification Report:\n{classification_rep}')

output:
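
A Gaussian naive Bayes classifier stores a prior plus a per-feature mean and variance for each class, then picks the class with the highest log posterior. The sketch below is a simplified plain-Python version of that idea, not scikit-learn's implementation. The helper names, the small variance floor of 1e-9, and the six training rows are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Return {label: (prior, feature means, feature variances)}."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small floor keeps the variance strictly positive (assumed value)
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Pick the class with the largest log prior plus log likelihood."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical two-feature training data with two well-separated classes
X = [[1.0, 20.0], [1.2, 22.0], [0.9, 19.0], [3.0, 40.0], [3.2, 42.0], [2.9, 41.0]]
y = [0, 0, 0, 1, 1, 1]

model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [1.1, 21.0]), predict_gaussian_nb(model, [3.1, 41.0]))
```

The "naive" part is the inner loop: each feature contributes its own independent term to the score, with no interaction between features.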

Ques 4: Write a program to implement the k-nearest neighbors (KNN) and Support Vector
Machine (SVM) algorithms for classification.
Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

# Load the kyphosis dataset

df = pd.read_csv("kyphosis.csv")

# Display the first few rows of the dataset

print("Dataset Preview:")
print(df.head())

# Separate features (X) and target variable (y)

X = df.drop("Kyphosis", axis=1)
y = df["Kyphosis"]

# Split the dataset into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features (important for SVM)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# K-nearest neighbors (KNN) algorithm

knn_model = KNeighborsClassifier(n_neighbors=3)

# KNN is distance-based, so it is also fit on the standardized features
knn_model.fit(X_train_scaled, y_train)

# Predictions using KNN

knn_predictions = knn_model.predict(X_test_scaled)

# SVM algorithm
svm_model = SVC(kernel='linear')
svm_model.fit(X_train_scaled, y_train)

# Predictions using SVM

svm_predictions = svm_model.predict(X_test_scaled)

# Evaluate the models

print("\nKNN Accuracy:", accuracy_score(y_test, knn_predictions))
print("\nClassification Report for KNN:")
print(classification_report(y_test, knn_predictions))

print("\nSVM Accuracy:", accuracy_score(y_test, svm_predictions))

print("\nClassification Report for SVM:")
print(classification_report(y_test, svm_predictions))

output:
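
KNN classifies a point by a majority vote among its nearest training points, so features measured on large scales dominate the distance unless they are standardized first. The sketch below shows both steps in plain Python. The standardize and knn_predict helpers are hypothetical, and the rows only imitate the shape of the kyphosis features (Age, Number, Start) with invented values.

```python
import math
from collections import Counter

def standardize(reference_rows, rows):
    """Scale rows to zero mean and unit variance using reference_rows statistics."""
    columns = list(zip(*reference_rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / len(col))
            for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def knn_predict(train_X, train_y, query, k=3):
    """Majority label among the k training rows closest to query (Euclidean)."""
    neighbours = sorted(zip(train_X, train_y),
                        key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Invented rows in the order Age, Number, Start (illustration only)
train_X = [[10, 3, 15], [20, 4, 14], [15, 3, 16],
           [120, 6, 5], [130, 7, 4], [110, 5, 6]]
train_y = ["absent", "absent", "absent", "present", "present", "present"]
queries = [[12, 3, 15], [125, 6, 5]]

# The queries are scaled with the training statistics, never their own
train_scaled = standardize(train_X, train_X)
queries_scaled = standardize(train_X, queries)
predictions = [knn_predict(train_scaled, train_y, q) for q in queries_scaled]
print(predictions)
```

Scaling the queries with the training means and standard deviations mirrors the fit_transform on the training set followed by transform on the test set used in the program above.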

Ques 5: Implement classification of a given dataset using random forest and decision
tree.
Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

df = pd.read_csv("kyphosis.csv")

# Display the first few rows of the dataset

print(df.head())

# Split the data into features (X) and target variable (y)
X = df.drop('Kyphosis', axis=1)
y = df['Kyphosis']

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Decision Tree Classifier

dt_classifier = DecisionTreeClassifier(random_state=42)
dt_classifier.fit(X_train, y_train)

# Predictions on the test set

dt_predictions = dt_classifier.predict(X_test)

# Evaluate Decision Tree

print("\nDecision Tree Classifier:")
print("Accuracy:", accuracy_score(y_test, dt_predictions))

print("Classification Report:")
print(classification_report(y_test, dt_predictions))

# Random Forest Classifier

rf_classifier = RandomForestClassifier(random_state=42)
rf_classifier.fit(X_train, y_train)

# Predictions on the test set

rf_predictions = rf_classifier.predict(X_test)

# Evaluate Random Forest

print("\nRandom Forest Classifier:")
print("Accuracy:", accuracy_score(y_test, rf_predictions))
print("Classification Report:")
print(classification_report(y_test, rf_predictions))

output:
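
DecisionTreeClassifier chooses its splits by Gini impurity by default, and a random forest combines many such trees trained on resampled data. The sketch below shows only the impurity calculation for one candidate split, in plain Python. The gini and split_gini helpers and the six Start values are made up for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, 0.5 for an even two-class mix."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(values, labels, threshold):
    """Size-weighted impurity of the two children of the split value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Invented Start values and outcomes (illustration only)
start = [5, 6, 4, 14, 15, 16]
outcome = ["present", "present", "present", "absent", "absent", "absent"]

print(gini(outcome))                   # impurity before splitting
print(split_gini(start, outcome, 10))  # threshold that separates the classes
print(split_gini(start, outcome, 5))   # a worse threshold
```

A tree builder evaluates many thresholds like these for every feature and keeps the one with the lowest weighted impurity, then repeats the process inside each child node.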

Ques 6: Build an Artificial Neural Network (ANN) by implementing the Backpropagation
algorithm and test the same using appropriate data sets.
Code:
import numpy as np

# Sigmoid activation function and its derivative

def sigmoid(x, derivative=False):
    if derivative:
        # Here x must already be a sigmoid output, so the slope is x * (1 - x)
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

# Input data for XOR problem

X = np.array([[0, 0],
[0, 1],
[1, 0],
[1, 1]])

# Target labels for XOR

y = np.array([[0],
[1],
[1],
[0]])

# Set random seed for reproducibility

np.random.seed(42)

# Neural Network architecture

input_layer_size = 2
hidden_layer_size = 4
output_layer_size = 1

# Initialize weights (this network uses no bias terms)
weights_input_hidden = 2 * np.random.random((input_layer_size, hidden_layer_size)) - 1
weights_hidden_output = 2 * np.random.random((hidden_layer_size, output_layer_size)) - 1

# Training parameters
learning_rate = 0.5
epochs = 10000

# Training the Neural Network using backpropagation

for epoch in range(epochs):
    # Forward pass
    hidden_layer_input = np.dot(X, weights_input_hidden)
    hidden_layer_output = sigmoid(hidden_layer_input)

    output_layer_input = np.dot(hidden_layer_output, weights_hidden_output)
    predicted_output = sigmoid(output_layer_input)

    # Calculate the error
    error = y - predicted_output

    # Backpropagation
    output_error_term = error * sigmoid(predicted_output, derivative=True)
    hidden_error = output_error_term.dot(weights_hidden_output.T)
    hidden_error_term = hidden_error * sigmoid(hidden_layer_output, derivative=True)

    # Update weights
    weights_hidden_output += hidden_layer_output.T.dot(output_error_term) * learning_rate
    weights_input_hidden += X.T.dot(hidden_error_term) * learning_rate

# Test the trained Neural Network

test_data = np.array([[0, 0],
[0, 1],
[1, 0],
[1, 1]])

predicted_output_test = sigmoid(sigmoid(test_data.dot(weights_input_hidden)).dot(weights_hidden_output))

print("Predicted Output after Training:")

print(predicted_output_test)
output:
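
The backpropagation step relies on the identity that the slope of the sigmoid equals a * (1 - a), where a is the already-activated output. This is why the program passes layer outputs, not raw inputs, to sigmoid(..., derivative=True). A quick numerical check of that identity in plain Python, with helper names and the test point 0.3 chosen only for illustration:

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_slope_from_output(a):
    """Derivative of the sigmoid written in terms of its output a."""
    return a * (1.0 - a)

z = 0.3        # arbitrary pre-activation value
eps = 1e-6     # step size for the central finite difference

analytic = sigmoid_slope_from_output(sigmoid(z))
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

print(f"analytic={analytic:.8f} numeric={numeric:.8f}")
```

The two values agree to many decimal places. Passing the raw input z instead of sigmoid(z) to the output-based formula would give a different number, which is a common source of silent errors in hand-written backpropagation.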
