
To Improve The Performance of Models Predicting Ba

How to improve the Battery performance

Uploaded by

Balram Choudhary
To improve the performance of models predicting battery temperature, we enhanced our approach using the following steps:

Import Libraries and Load Data:

We start by importing necessary libraries, loading the data, and performing initial preprocessing:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split, learning_curve, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Dense, Dropout
import xgboost as xgb
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor

battery_data = pd.read_csv('00041.csv')
# .copy() avoids pandas' SettingWithCopyWarning when the Time column is reassigned below
data = battery_data[['Time', 'Voltage_measured', 'Current_measured', 'Temperature_measured',
                     'Current_load', 'Voltage_load']].copy()
data['Time'] = pd.to_datetime(data['Time'])
data.set_index('Time', inplace=True)

# Plot every signal against time for a first visual check
sns.lineplot(data=data[['Voltage_measured', 'Current_measured', 'Temperature_measured',
                        'Current_load', 'Voltage_load']])
plt.show()

Train-Test Split and Normalization:

We split the data into training and test sets and normalize it:

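# shuffle=False keeps the chronological order: the last 20% of rows form the test set
train, test = train_test_split(data, test_size=0.2, shuffle=False)

# Fit the scaler on the training rows only, then apply the same transform to the test rows
scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)
test_scaled = scaler.transform(test)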
train_scaled = pd.DataFrame(train_scaled, index=train.index, columns=train.columns)
test_scaled = pd.DataFrame(test_scaled, index=test.index, columns=test.columns)
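
Because the scaler is fit on all five columns, every target and prediction later in this walkthrough is in standardized units rather than in the original temperature units. The following is a minimal sketch of mapping one column back to its original scale using the `mean_` and `scale_` attributes of `StandardScaler`. The helper name `unscale_column` and the toy values are assumptions made for illustration; they are not part of the original code.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def unscale_column(values, scaler, col_idx):
    """Map standardized values of a single column back to its original units."""
    return np.asarray(values) * scaler.scale_[col_idx] + scaler.mean_[col_idx]

# Toy array standing in for the battery data (illustrative values only)
toy = np.array([[3.9, 24.0],
                [3.8, 26.0],
                [3.7, 28.0]])
toy_scaler = StandardScaler().fit(toy)
toy_scaled = toy_scaler.transform(toy)

# Column 1 plays the role of Temperature_measured here
restored = unscale_column(toy_scaled[:, 1], toy_scaler, col_idx=1)
print(restored)
```

In the walkthrough itself, the column index could be looked up with `train.columns.get_loc('Temperature_measured')`.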

Create Sequences:
For LSTM and GRU models, we create sequences of data points:

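def create_sequences(data, seq_length):
    # Each sample is a window of seq_length consecutive rows; the target is
    # the temperature of the row immediately after that window.
    xs, ys = [], []
    for i in range(len(data) - seq_length):
        x = data.iloc[i:(i + seq_length)].values
        y = data.iloc[i + seq_length]['Temperature_measured']
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

SEQ_LENGTH = 10
X_train, y_train = create_sequences(train_scaled, SEQ_LENGTH)
X_test, y_test = create_sequences(test_scaled, SEQ_LENGTH)

# Flat (non-sequence) inputs for XGBoost, KNN, and Random Forest: row i is
# paired with the temperature SEQ_LENGTH steps later, so the lengths match y.
X_train_flat = train_scaled.iloc[:-SEQ_LENGTH].values
y_train_flat = y_train
X_test_flat = test_scaled.iloc[:-SEQ_LENGTH].values
y_test_flat = y_test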
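
With the flat inputs above, XGBoost, KNN, and Random Forest see a single row per sample, while the LSTM and GRU see a ten-row window. One possible alternative, which is not part of the original approach, is to flatten each window into one long feature vector so that all models receive the same history. The sketch below assumes an input shaped like `X_train`; the helper name `flatten_windows` and the toy array are hypothetical.

```python
import numpy as np

def flatten_windows(X_seq):
    """Reshape (n_samples, seq_length, n_features) to (n_samples, seq_length * n_features)."""
    n_samples = X_seq.shape[0]
    return X_seq.reshape(n_samples, -1)

# Toy stand-in for X_train: 4 windows, 10 time steps, 5 features
X_seq_toy = np.arange(4 * 10 * 5, dtype=float).reshape(4, 10, 5)
X_flat_toy = flatten_windows(X_seq_toy)
print(X_flat_toy.shape)  # (4, 50)
```

Each flattened row lists the five features of the first time step, then the five of the second, and so on, so the column order is time-major.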

Data Augmentation:
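We use bootstrapping (sampling with replacement) to draw 1,000 training examples for the non-sequence models:

def bootstrap_data(X, y, n_samples=1000):
    # Draw n_samples row indices uniformly at random, with replacement
    indices = np.random.randint(0, len(X), size=n_samples)
    return X[indices], y[indices]

X_train_aug, y_train_aug = bootstrap_data(X_train_flat, y_train_flat)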
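
The bootstrap above draws from NumPy's global random state, so each run produces a different sample. A reproducible variant can be sketched with `np.random.default_rng`. The function name `bootstrap_data_seeded`, the `seed` parameter, and the toy arrays are assumptions added for illustration.

```python
import numpy as np

def bootstrap_data_seeded(X, y, n_samples=1000, seed=0):
    """Sample rows with replacement using a seeded generator for repeatable results."""
    rng = np.random.default_rng(seed)
    indices = rng.integers(0, len(X), size=n_samples)
    return X[indices], y[indices]

# Toy data: 10 rows, where the first feature is always twice the target
X_toy = np.arange(20, dtype=float).reshape(10, 2)
y_toy = np.arange(10, dtype=float)

Xa, ya = bootstrap_data_seeded(X_toy, y_toy, n_samples=25, seed=42)
Xb, yb = bootstrap_data_seeded(X_toy, y_toy, n_samples=25, seed=42)
print(Xa.shape, np.array_equal(Xa, Xb))
```

Sampling with replacement repeats rows. When the bootstrapped sample is later split into cross-validation folds, copies of the same row can land in both the training and the validation fold, which tends to make the scores look better than they are.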

Define Models:
We define the deep LSTM and GRU networks and set up hyperparameter tuning for XGBoost:

def create_lstm_model(input_shape):
    model = Sequential([
        LSTM(100, activation='relu', return_sequences=True, input_shape=input_shape),
        Dropout(0.2),
        LSTM(50, activation='relu'),
        Dropout(0.2),
        Dense(1)
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

def create_gru_model(input_shape):
    model = Sequential([
        GRU(100, activation='relu', return_sequences=True, input_shape=input_shape),
        Dropout(0.2),
        GRU(50, activation='relu'),
        Dropout(0.2),
        Dense(1)
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

input_shape = (SEQ_LENGTH, train_scaled.shape[1])

model_lstm = create_lstm_model(input_shape)
model_gru = create_gru_model(input_shape)

param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.05, 0.1],
    'subsample': [0.8, 0.9, 1.0]
}

xgb_model = xgb.XGBRegressor(objective='reg:squarederror')
grid_search = GridSearchCV(estimator=xgb_model, param_grid=param_grid, cv=3,
                           scoring='neg_mean_squared_error', verbose=1, n_jobs=-1)
grid_search.fit(X_train_aug, y_train_aug)
best_xgb_model = grid_search.best_estimator_

Cross-Validation for KNN and Random Forest:


We perform cross-validation for the KNN and Random Forest models:

from sklearn.model_selection import cross_val_score

model_knn = KNeighborsRegressor(n_neighbors=5)
knn_scores = cross_val_score(model_knn, X_train_aug, y_train_aug, cv=5,
                             scoring='neg_mean_squared_error')
print("KNN Cross-Validation Scores:", knn_scores)

model_rf = RandomForestRegressor(n_estimators=100)
rf_scores = cross_val_score(model_rf, X_train_aug, y_train_aug, cv=5,
                            scoring='neg_mean_squared_error')
print("Random Forest Cross-Validation Scores:", rf_scores)
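
The scores printed above are negated MSE values, so they are zero or negative, and values closer to zero are better. The sketch below shows how they can be converted to per-fold RMSE. It also uses `TimeSeriesSplit`, a scikit-learn splitter that always validates on rows that come after the training rows. That splitter is an alternative and not what the original code does, and it only makes sense on time-ordered data such as `X_train_flat`, not on the shuffled bootstrap sample. The synthetic data is purely illustrative.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit, cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic, time-ordered stand-in for the flat training data
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 5))
y_toy = 0.5 * X_toy[:, 0] + rng.normal(scale=0.1, size=200)

# Each fold trains on an initial segment and validates on the segment that follows it
tscv = TimeSeriesSplit(n_splits=5)
neg_mse = cross_val_score(KNeighborsRegressor(n_neighbors=5), X_toy, y_toy,
                          cv=tscv, scoring='neg_mean_squared_error')

# Flip the sign to recover MSE, then take the square root for RMSE
rmse_per_fold = np.sqrt(-neg_mse)
print("RMSE per fold:", rmse_per_fold)
```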

Train Models:
We train the LSTM, GRU, XGBoost, KNN, and Random Forest models:

# The test set doubles as the validation set for the Keras learning curves
history_lstm = model_lstm.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test))
history_gru = model_gru.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test))

eval_set = [(X_train_aug, y_train_aug), (X_test_flat, y_test_flat)]
# Recent XGBoost releases no longer accept eval_metric in fit(); set it on the estimator instead
best_xgb_model.set_params(eval_metric="rmse")
best_xgb_model.fit(X_train_aug, y_train_aug, eval_set=eval_set, verbose=False)

model_knn.fit(X_train_aug, y_train_aug)
model_rf.fit(X_train_aug, y_train_aug)

Make Predictions:
We make predictions using the trained models:

y_pred_lstm = model_lstm.predict(X_test)
y_pred_gru = model_gru.predict(X_test)
y_pred_xgb = best_xgb_model.predict(X_test_flat)
y_pred_knn = model_knn.predict(X_test_flat)
y_pred_rf = model_rf.predict(X_test_flat)

# Simple ensemble: unweighted mean of the five models' predictions
y_pred_combined = (y_pred_lstm.flatten() + y_pred_gru.flatten() + y_pred_xgb + y_pred_knn +
                   y_pred_rf) / 5
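
The combined prediction gives every model the same weight, so one weak model can pull the average away from the target. One hedged alternative, which is not part of the original approach, is to weight each model by the inverse of its MSE on a held-out validation set. The function names `inverse_mse_weights` and `weighted_average` and the toy predictions below are hypothetical. In practice the weights should come from a validation split that is separate from the test set.

```python
import numpy as np

def inverse_mse_weights(y_val, preds):
    """Return one weight per model, proportional to 1 / validation MSE, summing to 1."""
    mses = {name: np.mean((np.asarray(p) - y_val) ** 2) for name, p in preds.items()}
    inv = {name: 1.0 / (m + 1e-12) for name, m in mses.items()}  # epsilon guards against MSE == 0
    total = sum(inv.values())
    return {name: v / total for name, v in inv.items()}

def weighted_average(preds, weights):
    """Combine 1-D prediction arrays using the given per-model weights."""
    return sum(weights[name] * np.asarray(p) for name, p in preds.items())

# Toy validation targets and two toy models with constant offsets
y_val_toy = np.array([0.0, 1.0, 2.0, 3.0])
preds_toy = {
    'good': y_val_toy + 0.1,   # small error
    'bad': y_val_toy + 1.0,    # large error
}

w = inverse_mse_weights(y_val_toy, preds_toy)
combined = weighted_average(preds_toy, w)
print(w)
print(combined)
```

In this toy case the low-error model receives almost all of the weight, so the combined prediction stays close to it.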

Visualize Results:
We prepare the results for visualization and plot them:

results = pd.DataFrame({
    'Actual': y_test,
    'LSTM': y_pred_lstm.flatten(),
    'GRU': y_pred_gru.flatten(),
    'XGBoost': y_pred_xgb,
    'KNN': y_pred_knn,
    'RF': y_pred_rf,
    'Combined': y_pred_combined
}, index=test.index[SEQ_LENGTH:])

plt.figure(figsize=(14, 8))
sns.lineplot(data=results, markers=True)
plt.title('Battery Temperature Prediction')
plt.xlabel('Time')
plt.ylabel('Temperature')
plt.legend()
plt.show()

def plot_learning_curve(history, model_name):
    # Plot the per-epoch training and validation loss recorded by Keras
    plt.figure(figsize=(14, 8))
    plt.plot(history.history['loss'], 'o-', label='Training Loss')
    plt.plot(history.history['val_loss'], 'o-', label='Validation Loss')
    plt.title(f'Learning Curve - {model_name}')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.show()

plot_learning_curve(history_lstm, 'LSTM')
plot_learning_curve(history_gru, 'GRU')

Evaluate Metrics:
We calculate the performance metrics for each model:

def calculate_metrics(y_true, y_pred):
    mae = mean_absolute_error(y_true, y_pred)
    mse = mean_squared_error(y_true, y_pred)
    rmse = np.sqrt(mse)
    r2 = r2_score(y_true, y_pred)
    return mae, mse, rmse, r2

metrics_lstm = calculate_metrics(y_test, y_pred_lstm.flatten())
metrics_gru = calculate_metrics(y_test, y_pred_gru.flatten())
metrics_xgb = calculate_metrics(y_test, y_pred_xgb)
metrics_knn = calculate_metrics(y_test, y_pred_knn)
metrics_rf = calculate_metrics(y_test, y_pred_rf)
metrics_combined = calculate_metrics(y_test, y_pred_combined)

comparison_table = pd.DataFrame({
    'Model': ['LSTM', 'GRU', 'XGBoost', 'KNN', 'Random Forest', 'Combined'],
    'MAE': [metrics_lstm[0], metrics_gru[0], metrics_xgb[0], metrics_knn[0], metrics_rf[0],
            metrics_combined[0]],
    'MSE': [metrics_lstm[1], metrics_gru[1], metrics_xgb[1], metrics_knn[1], metrics_rf[1],
            metrics_combined[1]],
    'RMSE': [metrics_lstm[2], metrics_gru[2], metrics_xgb[2], metrics_knn[2], metrics_rf[2],
             metrics_combined[2]],
    'R2': [metrics_lstm[3], metrics_gru[3], metrics_xgb[3], metrics_knn[3], metrics_rf[3],
           metrics_combined[3]]
})

print(comparison_table)

Plot Validation Curves:


We plot the internal and external validation curves for each model:

def plot_internal_validation_curve(model, X, y, model_name):
    # Cross-validated learning curve computed on the training data only
    train_sizes, train_scores, val_scores = learning_curve(model, X, y, cv=5,
                                                           scoring='neg_mean_squared_error', n_jobs=-1)
    train_scores_mean = -train_scores.mean(axis=1)
    val_scores_mean = -val_scores.mean(axis=1)

    plt.figure(figsize=(14, 8))
    plt.plot(train_sizes, train_scores_mean, 'o-', label='Training Error')
    plt.plot(train_sizes, val_scores_mean, 'o-', label='Validation Error')
    plt.title(f'Internal Validation Curve - {model_name}')
    plt.xlabel('Training Set Size')
    plt.ylabel('Error (MSE)')
    plt.legend()
    plt.show()

def plot_external_validation_curve(model, X, y, X_test, y_test, model_name):
    # Refit on growing fractions of the training data and score on the held-out test set
    train_sizes = np.linspace(0.1, 0.9, 9)
    train_scores = []
    val_scores = []

    for train_size in train_sizes:
        X_train_part, _, y_train_part, _ = train_test_split(X, y, train_size=train_size, random_state=42)
        model.fit(X_train_part, y_train_part)
        train_pred = model.predict(X_train_part)
        val_pred = model.predict(X_test)

        train_scores.append(mean_squared_error(y_train_part, train_pred))
        val_scores.append(mean_squared_error(y_test, val_pred))

    plt.figure(figsize=(14, 8))
    plt.plot(train_sizes, train_scores, 'o-', label='Training Error')
    plt.plot(train_sizes, val_scores, 'o-', label='Validation Error')
    plt.title(f'External Validation Curve - {model_name}')
    plt.xlabel('Training Set Size')
    plt.ylabel('Error (MSE)')
    plt.legend()
    plt.show()

plot_internal_validation_curve(best_xgb_model, X_train_aug, y_train_aug, 'XGBoost')
plot_external_validation_curve(best_xgb_model, X_train_aug, y_train_aug, X_test_flat, y_test_flat,
                               'XGBoost')

plot_internal_validation_curve(model_knn, X_train_aug, y_train_aug, 'KNN')
plot_external_validation_curve(model_knn, X_train_aug, y_train_aug, X_test_flat, y_test_flat, 'KNN')

plot_internal_validation_curve(model_rf, X_train_aug, y_train_aug, 'Random Forest')
plot_external_validation_curve(model_rf, X_train_aug, y_train_aug, X_test_flat, y_test_flat,
                               'Random Forest')

Results Summary:
After implementing the above steps, we achieved the following performance metrics:

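• The GRU model outperformed the others, with the lowest MAE (0.303), MSE (0.112), and RMSE (0.335), and the only positive R2 score (0.762).
• The LSTM ranked second (MAE 1.109), although its R2 of -2.866 is negative, meaning it fit the test set worse than a constant prediction of the mean test temperature.
• The combined (averaged) predictions ranked third (MAE 1.543, R2 -4.827).
• The XGBoost, KNN, and Random Forest models did not perform as well as the deep learning models; all three have strongly negative R2 scores.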

Model            MAE        MSE        RMSE       R2
LSTM             1.109204   1.819550   1.348907    -2.866460
GRU              0.303446   0.111909   0.334528     0.762198
XGBoost          1.990743   4.434215   2.105757    -8.422503
KNN              2.481018   6.893967   2.625637   -13.649365
Random Forest    1.832857   3.829704   1.956963    -7.137946
Combined         1.543454   2.742196   1.655958    -4.827040
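
Most R2 values in the table are negative. R2 is defined as 1 minus the ratio of the model's squared error to the squared error of always predicting the mean of the true values, so it drops below zero whenever the model does worse than that constant baseline. The sketch below works this through on toy numbers; the helper name `r2_manual` and the arrays are illustrative assumptions, not data from the experiment.

```python
import numpy as np

def r2_manual(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot, computed directly from its definition."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # squared error of the model
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # squared error of the mean baseline
    return 1.0 - ss_res / ss_tot

y_true_toy = np.array([1.0, 2.0, 3.0])
mean_baseline = np.full(3, y_true_toy.mean())  # always predicts 2.0
biased = y_true_toy + 2.0                      # right shape, constant offset of +2

r2_baseline = r2_manual(y_true_toy, mean_baseline)
r2_biased = r2_manual(y_true_toy, biased)
print(r2_baseline)  # 0.0
print(r2_biased)    # -5.0
```

The second case shows that a prediction can follow the shape of the target exactly and still score a strongly negative R2 when it carries a constant offset that is large relative to the spread of the true values.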
