phase3.3
Phase 3: Model Training and Evaluation
3.1 Overview of Model Training and Evaluation
In this phase, we focus on selecting suitable algorithms, training the models using processed
customer journey data, and evaluating their performance. The goal is to identify distinct
customer behavior patterns and enhance user experience by optimizing touchpoints along
their journey. Principal Component Analysis (PCA) is used for dimensionality reduction,
followed by K-Means clustering for segmentation. Evaluation metrics such as the silhouette
score are used to assess clustering quality and to check that the segmentation remains
stable across data splits.
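The workflow described above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the synthetic data from make_blobs stands in for the processed customer journey features, and the choice of two principal components and the candidate range k = 2 to 8 are assumptions made only for this example.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the processed customer journey features (assumption).
X, _ = make_blobs(n_samples=300, n_features=8, centers=4, random_state=42)

# Standardize, then project onto two principal components (assumed count).
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2, random_state=42)
pca_df = pd.DataFrame(pca.fit_transform(X_scaled), columns=["PC1", "PC2"])

# Score each candidate k, starting at 2, with the silhouette coefficient.
k_values = range(2, 9)
sil_scores = []
for k in k_values:
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(pca_df)
    sil_scores.append(silhouette_score(pca_df, labels))

# The list index is offset by 2 because the candidate range starts at k = 2.
optimal_clusters = sil_scores.index(max(sil_scores)) + 2
print(f"Selected k = {optimal_clusters}")
```

The silhouette coefficient lies between -1 and 1, with higher values indicating tighter, better separated clusters, which is why the k with the maximum score is selected.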
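from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# sil_scores holds one silhouette score per candidate k, starting at k = 2
optimal_clusters = sil_scores.index(max(sil_scores)) + 2

# Cross-validated silhouette scores; best_k is the selected number of clusters
silhouette_scores = []
for train_idx, test_idx in kf.split(pca_df):
    X_train, X_test = pca_df.iloc[train_idx], pca_df.iloc[test_idx]
    kmeans = KMeans(n_clusters=best_k, random_state=42)
    kmeans.fit(X_train)
    clusters_pred = kmeans.predict(X_test)
    silhouette_scores.append(silhouette_score(X_test, clusters_pred))  # score on the held-out fold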
Source code:
import matplotlib.pyplot as plt
import seaborn as sns

# Scatter Plot with Centroids
plt.figure(figsize=(14, 10))
sns.scatterplot(x=latent_features[:, 0], y=latent_features[:, 1], hue=clusters, palette='viridis',
                s=100, alpha=0.7, edgecolor='w', linewidth=0.6)
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=400, c='red',
            marker='X', edgecolor='black', linewidth=1.5, label='Centroids')
plt.title('Customer Journey Clusters with Centroids', fontsize=18, weight='bold')
plt.legend()
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()