Capstone Project Report 2
Uploaded by Saumya Singh
3/16/24, 8:08 PM - capstone project report

HOUSE PRICE PREDICTION

Structure:
1. Model Building with Dataset-1
2. Hypertuning Dataset-1
3. Summary - Dataset-1
4. Model Building with Dataset-2
5. Hypertuning Dataset-2
6. Summary - Dataset-2
7. Conclusion

Note:
Dataset-1 = 22 features
['price', 'room_bed', 'room_bath', 'living_measure', 'lot_measure', 'ceil', 'coast', 'sight', 'condition', 'quality', 'ceil_measure', 'yr_built', 'living_measure15', 'lot_measure15', 'furnished', 'total_area', 'month_year', 'city', 'has_basement', 'HouseLandRatio', 'has_renovated']

Dataset-2 = 31 features (important features after imputing dummies and analysing different models)
['price', 'room_bed', 'room_bath', 'living_measure', 'lot_measure', 'ceil', 'sight', 'condition', 'ceil_measure', 'basement', 'yr_built', 'yr_renovated', 'zipcode', 'lat', 'long', 'living_measure15', 'lot_measure15', 'total_area', 'coast_1', 'quality_3', 'quality_4', 'quality_5', 'quality_6', 'quality_7', 'quality_8', 'quality_9', 'quality_10', 'quality_11', 'quality_12', 'quality_13', 'furnished_1']

Model building

Let's build the models and see their performances.

Linear Regression (with Ridge and Lasso)

In [135]:
# importing the necessary libraries
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Ridge
from sklearn.linear_model import Lasso
from sklearn import metrics
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
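Every model below is scored with the same four metrics imported above. As a quick reference, this is what they compute, in a minimal pure-Python sketch on toy values (not the report's data):

```python
import math

def regression_scores(y_true, y_pred):
    """Return (r2, rmse, mse, mae) for paired target/prediction lists."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    mse = ss_res / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return 1 - ss_res / ss_tot, math.sqrt(mse), mse, mae

r2, rmse, mse, mae = regression_scores([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
# r2 = 0.98, mse = 0.025, mae = 0.15
```

R2 is the fraction of target variance the model explains, which is why a model no better than predicting the mean gets 0 and a harmful model can go negative (as happens with SVR later in this section).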
In [136]:
LR1 = LinearRegression()
LR1.fit(X_train, y_train)
# predicting results over the train and validation data
y_LR1_predtr = LR1.predict(X_train)
y_LR1_predvl = LR1.predict(X_val)
LR1.coef_

Out[136]:
array([ 4.65340600e+01, -2.04412513e+03,  1.71493571e+02, ...])

In [137]:
# Model score and deduction for each model in a DataFrame
LR1_trscore = r2_score(y_train, y_LR1_predtr)
LR1_trRMSE = np.sqrt(mean_squared_error(y_train, y_LR1_predtr))
LR1_trMSE = mean_squared_error(y_train, y_LR1_predtr)
LR1_trMAE = mean_absolute_error(y_train, y_LR1_predtr)
LR1_vlscore = r2_score(y_val, y_LR1_predvl)
LR1_vlRMSE = np.sqrt(mean_squared_error(y_val, y_LR1_predvl))
LR1_vlMSE = mean_squared_error(y_val, y_LR1_predvl)
LR1_vlMAE = mean_absolute_error(y_val, y_LR1_predvl)

Compa_df = pd.DataFrame({'Method': ['Linear Reg Model1'], 'Val Score': LR1_vlscore,
                         'RMSE_vl': LR1_vlRMSE, 'MSE_vl': LR1_vlMSE, 'MAE_vl': LR1_vlMAE,
                         'train Score': LR1_trscore, 'RMSE_tr': LR1_trRMSE,
                         'MSE_tr': LR1_trMSE, 'MAE_tr': LR1_trMAE})
Compa_df

Out[137]:
              Method  Val Score        RMSE_vl        MSE_vl         MAE_vl  train Score        RMSE_tr
0  Linear Reg Model1   0.718749  137733.698415  1.897057e+10   93994.455301     0.730112  132958.367261

The linear regression model scored 0.73 on the training set and 0.72 on the validation set.

In [138]:
sns.set(style="darkgrid", color_codes=True)
with sns.axes_style("white"):
    sns.jointplot(x=y_val, y=y_LR1_predvl, kind="reg")

[Figure: joint plot with regression line of predicted vs. actual price on the validation set]

Lasso model
In [139]:
Lasso1 = Lasso(alpha=1)
Lasso1.fit(X_train, y_train)
# predicting results over the train and validation data
y_Lasso1_predtr = Lasso1.predict(X_train)
y_Lasso1_predvl = Lasso1.predict(X_val)
Lasso1.coef_

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\coordinate_descent.py:492: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Fitting data with very small alpha may cause precision problems.
  ConvergenceWarning)

Out[139]:
array([ 9.65902931e+01, -2.04163675e+03,  1.73465396e+02, ...])

In [140]:
# Model score and deduction for each model in a DataFrame
Lasso1_trscore = r2_score(y_train, y_Lasso1_predtr)
Lasso1_trRMSE = np.sqrt(mean_squared_error(y_train, y_Lasso1_predtr))
Lasso1_trMSE = mean_squared_error(y_train, y_Lasso1_predtr)
Lasso1_trMAE = mean_absolute_error(y_train, y_Lasso1_predtr)
Lasso1_vlscore = r2_score(y_val, y_Lasso1_predvl)
Lasso1_vlRMSE = np.sqrt(mean_squared_error(y_val, y_Lasso1_predvl))
Lasso1_vlMSE = mean_squared_error(y_val, y_Lasso1_predvl)
Lasso1_vlMAE = mean_absolute_error(y_val, y_Lasso1_predvl)

Lasso1_df = pd.DataFrame({'Method': ['Linear-Reg Lasso1'], 'Val Score': Lasso1_vlscore,
                          'RMSE_vl': Lasso1_vlRMSE, 'MSE_vl': Lasso1_vlMSE, 'MAE_vl': Lasso1_vlMAE,
                          'train Score': Lasso1_trscore, 'RMSE_tr': Lasso1_trRMSE,
                          'MSE_tr': Lasso1_trMSE, 'MAE_tr': Lasso1_trMAE})
Compa_df = pd.concat([Compa_df, Lasso1_df])
Compa_df

Out[140]:
              Method  Val Score        RMSE_vl        MSE_vl         MAE_vl  train Score        RMSE_tr
0  Linear Reg Model1   0.718749  137733.698415  1.897057e+10   93994.455301     0.730112  132958.367261
0  Linear-Reg Lasso1   0.719117  137643.639712  1.894577e+10   93999.441186     0.730092  132963.180396

The Lasso linear regression model also scored 0.73 on the training set and 0.72 on the validation set. The coefficient of one variable in the Lasso model is almost 0, signifying that the variable with the near-zero coefficient can be dropped.
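That zeroing-out behaviour is intrinsic to the L1 penalty: Lasso's coordinate-descent update applies a soft-thresholding operator that snaps small coefficients exactly to zero. A minimal sketch of the operator (illustrative only, not the report's code):

```python
def soft_threshold(z, lam):
    """Lasso's shrinkage operator: sign(z) * max(|z| - lam, 0).

    Coefficients whose magnitude falls below lam are set exactly to zero,
    which is why Lasso can double as a feature-selection tool.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0
```

For example, `soft_threshold(3.0, 1.0)` shrinks to 2.0, while `soft_threshold(0.5, 1.0)` is exactly 0.0, so the corresponding feature drops out of the model.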
In [141]:
sns.set(style="darkgrid", color_codes=True)
with sns.axes_style("white"):
    sns.jointplot(x=y_val, y=y_Lasso1_predvl, kind="reg")

[Figure: joint plot with regression line of Lasso predictions vs. actual price on the validation set]

Ridge model

In [142]:
Ridge1 = Ridge(alpha=0.5)
Ridge1.fit(X_train, y_train)
# predicting results over the train and validation data
y_Ridge1_predtr = Ridge1.predict(X_train)
y_Ridge1_predvl = Ridge1.predict(X_val)
Ridge1.coef_

Out[142]:
array([ 4.66834622e+01, -2.04329070e+03,  1.99149390e+02, ...])

In [143]:
# Model score and deduction for each model in a DataFrame
Ridge1_trscore = r2_score(y_train, y_Ridge1_predtr)
Ridge1_trRMSE = np.sqrt(mean_squared_error(y_train, y_Ridge1_predtr))
Ridge1_trMSE = mean_squared_error(y_train, y_Ridge1_predtr)
Ridge1_trMAE = mean_absolute_error(y_train, y_Ridge1_predtr)
Ridge1_vlscore = r2_score(y_val, y_Ridge1_predvl)
Ridge1_vlRMSE = np.sqrt(mean_squared_error(y_val, y_Ridge1_predvl))
Ridge1_vlMSE = mean_squared_error(y_val, y_Ridge1_predvl)
Ridge1_vlMAE = mean_absolute_error(y_val, y_Ridge1_predvl)

Ridge1_df = pd.DataFrame({'Method': ['Linear-Reg Ridge1'], 'Val Score': Ridge1_vlscore,
                          'RMSE_vl': Ridge1_vlRMSE, 'MSE_vl': Ridge1_vlMSE, 'MAE_vl': Ridge1_vlMAE,
                          'train Score': Ridge1_trscore, 'RMSE_tr': Ridge1_trRMSE,
                          'MSE_tr': Ridge1_trMSE, 'MAE_tr': Ridge1_trMAE})
Compa_df = pd.concat([Compa_df, Ridge1_df])
Compa_df

Out[143]:
              Method  Val Score        RMSE_vl        MSE_vl         MAE_vl  train Score        RMSE_tr
0  Linear Reg Model1   0.718749  137733.698415  1.897057e+10   93994.455301     0.730112  132958.367261
0  Linear-Reg Lasso1   0.719117  137643.639712  1.894577e+10   93999.441186     0.730092  132963.180396
0  Linear-Reg Ridge1   0.718929  137689.597398  1.895843e+10   93992.809617     0.729789  133037.735155

The Ridge linear regression model likewise scored 0.73 on the training set and 0.72 on the validation set. The coefficients in the Ridge model are all non-zero, indicating that none of the variables can be dropped.
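Ridge's behaviour follows from its closed form w = (X'X + alpha*I)^-1 X'y: the alpha*I term shrinks every coefficient toward zero but never exactly to zero, which is why no feature drops out. A tiny illustration on made-up data (not the report's dataset):

```python
import numpy as np

X = np.array([[1., 0.], [0., 1.], [1., 1.]])
y = np.array([1., 2., 3.])
alpha = 1.0

# ordinary least squares: solve (X'X) w = X'y
w_ols = np.linalg.solve(X.T @ X, X.T @ y)                      # [1.0, 2.0]
# ridge: solve (X'X + alpha*I) w = X'y
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(2), X.T @ y)  # [0.875, 1.375]

# every ridge coefficient is shrunk in magnitude, but none is exactly zero
```

Comparing the two solutions shows the shrinkage directly: both coefficients get smaller under the penalty, yet both survive, matching the "nothing can be dropped" observation above.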
In [144]:
sns.set(style="darkgrid", color_codes=True)
with sns.axes_style("white"):
    sns.jointplot(x=y_val, y=y_Ridge1_predvl, kind="reg")

[Figure: joint plot with regression line of Ridge predictions vs. actual price on the validation set]

In summary, the linear models have performed with almost identical results in both the regularized and non-regularized variants.

KNN Regressor

In [145]:
from sklearn.neighbors import KNeighborsRegressor

In [146]:
knn1 = KNeighborsRegressor(n_neighbors=4, weights='distance')
knn1.fit(X_train, y_train)
# predicting results over the train and validation data
y_knn1_predtr = knn1.predict(X_train)
y_knn1_predvl = knn1.predict(X_val)

In [147]:
# Model score and deduction for each model in a DataFrame
knn1_trscore = r2_score(y_train, y_knn1_predtr)
knn1_trRMSE = np.sqrt(mean_squared_error(y_train, y_knn1_predtr))
knn1_trMSE = mean_squared_error(y_train, y_knn1_predtr)
knn1_trMAE = mean_absolute_error(y_train, y_knn1_predtr)
knn1_vlscore = r2_score(y_val, y_knn1_predvl)
knn1_vlRMSE = np.sqrt(mean_squared_error(y_val, y_knn1_predvl))
knn1_vlMSE = mean_squared_error(y_val, y_knn1_predvl)
knn1_vlMAE = mean_absolute_error(y_val, y_knn1_predvl)

knn1_df = pd.DataFrame({'Method': ['knn1'], 'Val Score': knn1_vlscore,
                        'RMSE_vl': knn1_vlRMSE, 'MSE_vl': knn1_vlMSE, 'MAE_vl': knn1_vlMAE,
                        'train Score': knn1_trscore, 'RMSE_tr': knn1_trRMSE,
                        'MSE_tr': knn1_trMSE, 'MAE_tr': knn1_trMAE})
Compa_df = pd.concat([Compa_df, knn1_df])
Compa_df

Out[147]:
  Method  Val Score        RMSE_vl        MSE_vl         MAE_vl  train Score      RMSE_tr
     ...  (rows for the linear models as above)
0   knn1   0.425008  196935.451160  3.878357e+10  138494.383286     0.998628  9480.192071

Though the KNN regressor performed well on the training set, its validation score is much lower. This shows that the model is overfitted to the training set.

Support vector regressor

In [148]:
from sklearn.svm import SVR

In [149]:
SVR1 = SVR(gamma='auto', C=0.2, epsilon=0.2, kernel='rbf')
SVR1.fit(X_train, y_train)
y_SVR1_predtr = SVR1.predict(X_train)
y_SVR1_predvl = SVR1.predict(X_val)

In [150]:
# Model score and deduction for each model in a DataFrame
SVR1_trscore = r2_score(y_train, y_SVR1_predtr)
SVR1_trRMSE = np.sqrt(mean_squared_error(y_train, y_SVR1_predtr))
SVR1_trMSE = mean_squared_error(y_train, y_SVR1_predtr)
SVR1_trMAE = mean_absolute_error(y_train, y_SVR1_predtr)
SVR1_vlscore = r2_score(y_val, y_SVR1_predvl)
SVR1_vlRMSE = np.sqrt(mean_squared_error(y_val, y_SVR1_predvl))
SVR1_vlMSE = mean_squared_error(y_val, y_SVR1_predvl)
SVR1_vlMAE = mean_absolute_error(y_val, y_SVR1_predvl)

SVR1_df = pd.DataFrame({'Method': ['SVR1'], 'Val Score': SVR1_vlscore,
                        'RMSE_vl': SVR1_vlRMSE, 'MSE_vl': SVR1_vlMSE, 'MAE_vl': SVR1_vlMAE,
                        'train Score': SVR1_trscore, 'RMSE_tr': SVR1_trRMSE,
                        'MSE_tr': SVR1_trMSE, 'MAE_tr': SVR1_trMAE})
Compa_df = pd.concat([Compa_df, SVR1_df])
Compa_df

Out[150]:
  Method  Val Score        RMSE_vl        MSE_vl         MAE_vl  train Score        RMSE_tr
     ...  (rows for the earlier models as above)
0   SVR1  -0.055489  266820.956555  7.119342e+10  183639.593215    -0.046405  261802.341726
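The near-perfect knn1 training score has a mechanical explanation: with weights='distance', each training point is its own nearest neighbour at distance zero, so its own target dominates the weighted average and training predictions are almost exact. A pure-Python sketch of plain k-NN regression on 1-D toy data (uniform weights, illustrative only, not the report's code):

```python
def knn_predict(x, train_X, train_y, k=1):
    """Brute-force k-nearest-neighbour regression with uniform weights."""
    nearest = sorted(range(len(train_X)), key=lambda i: abs(train_X[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

# querying at a training point with k=1 reproduces its target exactly;
# this is the degenerate case that distance weighting generalises
```

For instance, `knn_predict(1.0, [0.0, 1.0, 2.0], [10.0, 20.0, 30.0], k=1)` returns 20.0 (the query point's own target), while `k=2` averages the two closest targets. That memorisation of the training set is exactly why the train R2 is near 1 while the validation R2 collapses.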
The negative SVR1 scores show that the model failed to learn from the training set, which results in non-performance on the validation set as well.

In [151]:
SVR2 = SVR(gamma='auto', C=0.1, kernel='linear')
SVR2.fit(X_train, y_train)
y_SVR2_predtr = SVR2.predict(X_train)
y_SVR2_predvl = SVR2.predict(X_val)

# Model score and deduction for each model in a DataFrame
SVR2_trscore = r2_score(y_train, y_SVR2_predtr)
SVR2_trRMSE = np.sqrt(mean_squared_error(y_train, y_SVR2_predtr))
SVR2_trMSE = mean_squared_error(y_train, y_SVR2_predtr)
SVR2_trMAE = mean_absolute_error(y_train, y_SVR2_predtr)
SVR2_vlscore = r2_score(y_val, y_SVR2_predvl)
SVR2_vlRMSE = np.sqrt(mean_squared_error(y_val, y_SVR2_predvl))
SVR2_vlMSE = mean_squared_error(y_val, y_SVR2_predvl)
SVR2_vlMAE = mean_absolute_error(y_val, y_SVR2_predvl)

SVR2_df = pd.DataFrame({'Method': ['SVR2'], 'Val Score': SVR2_vlscore,
                        'RMSE_vl': SVR2_vlRMSE, 'MSE_vl': SVR2_vlMSE, 'MAE_vl': SVR2_vlMAE,
                        'train Score': SVR2_trscore, 'RMSE_tr': SVR2_trRMSE,
                        'MSE_tr': SVR2_trMSE, 'MAE_tr': SVR2_trMAE})
Compa_df = pd.concat([Compa_df, SVR2_df])
Compa_df

Out[151]:
  Method  Val Score        RMSE_vl        MSE_vl         MAE_vl  train Score        RMSE_tr
     ...  (rows for the earlier models as above)
0   SVR2   0.458252  191157.623415  3.654124e+10  132876.663665     0.454410  189041.408746

The SVR model with modified parameters has still not performed well, with scores of only ~0.45 on both the training and validation sets.

Decision Tree Regressor

In [152]:
from sklearn.tree import DecisionTreeRegressor

In [153]:
DT1 = DecisionTreeRegressor()
DT1.fit(X_train, y_train)
y_DT1_predtr = DT1.predict(X_train)
y_DT1_predvl = DT1.predict(X_val)

# Model score and deduction for each model in a DataFrame
DT1_trscore = r2_score(y_train, y_DT1_predtr)
DT1_trRMSE = np.sqrt(mean_squared_error(y_train, y_DT1_predtr))
DT1_trMSE = mean_squared_error(y_train, y_DT1_predtr)
DT1_trMAE = mean_absolute_error(y_train, y_DT1_predtr)
DT1_vlscore = r2_score(y_val, y_DT1_predvl)
DT1_vlRMSE = np.sqrt(mean_squared_error(y_val, y_DT1_predvl))
DT1_vlMSE = mean_squared_error(y_val, y_DT1_predvl)
DT1_vlMAE = mean_absolute_error(y_val, y_DT1_predvl)

DT1_df = pd.DataFrame({'Method': ['DT1'], 'Val Score': DT1_vlscore,
                       'RMSE_vl': DT1_vlRMSE, 'MSE_vl': DT1_vlMSE, 'MAE_vl': DT1_vlMAE,
                       'train Score': DT1_trscore, 'RMSE_tr': DT1_trRMSE,
                       'MSE_tr': DT1_trMSE, 'MAE_tr': DT1_trMAE})
Compa_df = pd.concat([Compa_df, DT1_df])
Compa_df

Out[153]:
  Method  Val Score        RMSE_vl        MSE_vl         MAE_vl  train Score      RMSE_tr
     ...  (rows for the earlier models as above)
0    DT1   0.542495  175867.376246  3.085903e+10  109891.238551     0.998628  9480.192071

The performance of the initial decision tree model shows overfitting: a 0.99 score on the training set against low performance on the validation set.

In [154]:
DT2 = DecisionTreeRegressor(max_depth=10, min_samples_leaf=5)
DT2.fit(X_train, y_train)
y_DT2_predtr = DT2.predict(X_train)
y_DT2_predvl = DT2.predict(X_val)

# Model score and deduction for each model in a DataFrame
DT2_trscore = r2_score(y_train, y_DT2_predtr)
DT2_trRMSE = np.sqrt(mean_squared_error(y_train, y_DT2_predtr))
DT2_trMSE = mean_squared_error(y_train, y_DT2_predtr)
DT2_trMAE = mean_absolute_error(y_train, y_DT2_predtr)
DT2_vlscore = r2_score(y_val, y_DT2_predvl)
DT2_vlRMSE = np.sqrt(mean_squared_error(y_val, y_DT2_predvl))
DT2_vlMSE = mean_squared_error(y_val, y_DT2_predvl)
DT2_vlMAE = mean_absolute_error(y_val, y_DT2_predvl)

DT2_df = pd.DataFrame({'Method': ['DT2'], 'Val Score': DT2_vlscore,
                       'RMSE_vl': DT2_vlRMSE, 'MSE_vl': DT2_vlMSE, 'MAE_vl': DT2_vlMAE,
                       'train Score': DT2_trscore, 'RMSE_tr': DT2_trRMSE,
                       'MSE_tr': DT2_trMSE, 'MAE_tr': DT2_trMAE})
Compa_df = pd.concat([Compa_df, DT2_df])
Compa_df

Out[154]:
  Method  Val Score        RMSE_vl        MSE_vl         MAE_vl  train Score        RMSE_tr
     ...  (rows for the earlier models as above)
0    DT2   0.637513  156364.920550  2.444999e+10  102458.587308     0.794647  115977.718333

The decision tree with modified parameters performs better on both the training and validation sets than the initial tree. But overall, the decision tree has still not performed as well as the linear regression models.

In [155]:
sns.set(style="darkgrid", color_codes=True)
with sns.axes_style("white"):
    sns.jointplot(x=y_val, y=y_DT2_predvl, kind="reg")

[Figure: joint plot with regression line of decision tree predictions vs. actual price on the validation set]

In summary, the KNN regressor and decision tree models have not performed well in comparison with the linear regression models.

Ensemble techniques: Boosting and Bagging

In [156]:
from sklearn.ensemble import GradientBoostingRegressor, BaggingRegressor

In [157]:
GB1 = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1, random_state=22)
GB1.fit(X_train, y_train)
y_GB1_predtr = GB1.predict(X_train)
y_GB1_predvl = GB1.predict(X_val)

# Model score and deduction for each model in a DataFrame
GB1_trscore = r2_score(y_train, y_GB1_predtr)
GB1_trRMSE = np.sqrt(mean_squared_error(y_train, y_GB1_predtr))
GB1_trMSE = mean_squared_error(y_train, y_GB1_predtr)
GB1_trMAE = mean_absolute_error(y_train, y_GB1_predtr)
GB1_vlscore = r2_score(y_val, y_GB1_predvl)
GB1_vlRMSE = np.sqrt(mean_squared_error(y_val, y_GB1_predvl))
GB1_vlMSE = mean_squared_error(y_val, y_GB1_predvl)
GB1_vlMAE = mean_absolute_error(y_val, y_GB1_predvl)

GB1_df = pd.DataFrame({'Method': ['GB1'], 'Val Score': GB1_vlscore,
                       'RMSE_vl': GB1_vlRMSE, 'MSE_vl': GB1_vlMSE, 'MAE_vl': GB1_vlMAE,
                       'train Score': GB1_trscore, 'RMSE_tr': GB1_trRMSE,
                       'MSE_tr': GB1_trMSE, 'MAE_tr': GB1_trMAE})
Compa_df = pd.concat([Compa_df, GB1_df])
Compa_df

Out[157]:
  Method  Val Score        RMSE_vl        MSE_vl        MAE_vl  train Score        RMSE_tr
     ...  (rows for the earlier models as above)
0    GB1   0.782471  121129.980228  1.467247e+10  82624.319932     0.820821  108334.766538

The gradient boosting model has provided good scores on both the training and validation sets.

In [158]:
BGG1 = BaggingRegressor(n_estimators=50, oob_score=True, random_state=14)
BGG1.fit(X_train, y_train)
y_BGG1_predtr = BGG1.predict(X_train)
y_BGG1_predvl = BGG1.predict(X_val)

# Model score and deduction for each model in a DataFrame
BGG1_trscore = r2_score(y_train, y_BGG1_predtr)
BGG1_trRMSE = np.sqrt(mean_squared_error(y_train, y_BGG1_predtr))
BGG1_trMSE = mean_squared_error(y_train, y_BGG1_predtr)
BGG1_trMAE = mean_absolute_error(y_train, y_BGG1_predtr)
BGG1_vlscore = r2_score(y_val, y_BGG1_predvl)
BGG1_vlRMSE = np.sqrt(mean_squared_error(y_val, y_BGG1_predvl))
BGG1_vlMSE = mean_squared_error(y_val, y_BGG1_predvl)
BGG1_vlMAE = mean_absolute_error(y_val, y_BGG1_predvl)

BGG1_df = pd.DataFrame({'Method': ['BGG1'], 'Val Score': BGG1_vlscore,
                        'RMSE_vl': BGG1_vlRMSE, 'MSE_vl': BGG1_vlMSE, 'MAE_vl': BGG1_vlMAE,
                        'train Score': BGG1_trscore, 'RMSE_tr': BGG1_trRMSE,
                        'MSE_tr': BGG1_trMSE, 'MAE_tr': BGG1_trMAE})
Compa_df = pd.concat([Compa_df, BGG1_df])
Compa_df

Out[158]:
  Method  Val Score        RMSE_vl        MSE_vl        MAE_vl  train Score       RMSE_tr
     ...  (rows for the earlier models as above)
0   BGG1   0.769319  124738.101557  1.555959e+10  80102.360544     0.966466  46867.181534

The bagging model also performed well on the training and validation sets, though the gap between the two scores suggests overfitting on the training set. We need to analyse this further through hyperparameter tuning.
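The oob_score=True option relies on a property of bootstrap sampling: each resample draws n rows with replacement, so on average only about 1 - 1/e (roughly 63%) of the training rows land in any one bag, and the remaining ~37% can serve as that estimator's out-of-bag validation data. A quick stdlib illustration:

```python
import random

random.seed(0)
n = 100_000
bag = [random.randrange(n) for _ in range(n)]  # one bootstrap resample of row indices
in_bag_fraction = len(set(bag)) / n            # fraction of distinct rows drawn
# in_bag_fraction is close to 1 - 1/e, i.e. about 0.632
```

This is why the out-of-bag score is a cheap stand-in for a validation score: every base estimator is automatically tested on rows it never saw during its own fit.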
Nm sere3116124, 8:08 PM capstone projectreport, Random forest In [159 from sklearn.ensemble import RandonForestRegressor {ie:1IC:Users/SAUMYA SINGHiDocumentsiPP_RE_01 - House Price Predillon/Sol t/capstone projec report. Nm sen7e3116124, 8:08 PM In [169]: out[160]: capstone projectreport, RF1-RandomForestRegressor() RF1.fit(X_train, y train) y_RF1_predtr: y_RF1_predvl.: RF1.predict(x_train) RF1.predict(Xx_val) #Model score and Deduction for each Model in a Dataframe RF1_trscore=r2_score(y_train,y_RF1_predtr) RF1_trRMSE=np..sqrt(mean_squared_error(y train, y_RF1_predtr)) RF1_trWSE=mean_squared_error(y train, y_RF1_predtr) RF1_trMAE=mean_absolute_error(y train, y_RF1_predtr) RF1_vIscore=r2_score(y_val,y_RF1_predvl) RF1_vIRMSE=np.sqrt(mean_squared_error(y_val, y_RF1_predvl)) RF1_vINSE-mean_squared_error(y_val, y_RF1_predvl) RF1_vINAE=mean_absolute_error(y_val, y_RF1_predvl) RF1_df=pd.DataFrame({"Method' :['RF1'], 'Val Score’ :RF1_vIscore, 'RMSF_vl': RF1_v IRMSE, 'MSE_vl':RF1_vIMSE, 'MAE_v1': RF1_vIMAE,'train Score’ :RF1_trscore, ‘RMSE _tr': RFI_tPRMSE, 'NSE_tr': RF1_trMSE, 'MAE_tr': RF1_trMAE}) Compa_df = pd.concat([Compa_df, RF1_df]) Compa_df Method Val Score RMSE_vi MSE_MI mac tain RMSE_tr Linear 0 Reg 0.78749 137733.698415 1.897057e+10 93994.455301 0.730112 132958.367261 1.71 Model! Linear- 0 Reg 0.719117 137643.639712 1.894577e+10 93999.441186 0.730082 132963.180396 1.71 Lasso! Linear- 0 Reg 0.718929 137689,597398 1,895843e+10 93992,809617 0.729789 133037.735155 1.71 Ridge? 0 kant 0.425008 196935.451160 3.878357e+10 138494.383286 0.998628 9480.192071 a.ai 0 SVRt -0,085489 266820.956555 7.119342e+10 183639,593215 -0,046405 261802341726 6.8! 
          Method  val score       RMSE_val       MSE_val        MAE_val  train score        RMSE_tr  ...
0           SVR2   0.458252  191157.623415  3.654124e+10  132876.663665     0.454410  189041.408746  ...
0            DT1   0.542495  175867.376246  3.085903e+10  109891.238551     0.998628    9480.192071  ...
0            DT2   0.637513  156364.920550  2.444999e+10  102458.587308     0.794647  115977.718333  ...
0            GB1   0.782471  121129.980228  1.467247e+10   82624.319932     0.820821  108394.766538  ...
0            BG1   0.769319  124738.101557  1.555959e+10   80102.360544     0.986466   46867.181534  ...
0            RF1   0.754483  128686.871977  1.656031e+10   82901.717082     0.954362   54674.891629  ...

The random forest model has performed well on both the training and validation sets. There is scope for further analysis of this model.

Ensemble models: in summary, the ensemble models have performed well on the training and validation sets. These models will be selected for further analysis with hyperparameter tuning and feature selection.

In [161]: # feature importance
          rf_imp_feature_1 = pd.DataFrame(RF1.feature_importances_, columns=['Imp'], index=X_val.columns)
          rf_imp_feature_1.sort_values(by='Imp', ascending=False)
          rf_imp_feature_1['Imp'] = rf_imp_feature_1['Imp'].map('{0:.5f}'.format)
          rf_imp_feature_1 = rf_imp_feature_1.sort_values(by='Imp', ascending=False)
          rf_imp_feature_1.Imp = rf_imp_feature_1.Imp.astype('float')
          rf_imp_feature_1[:30].plot.bar(figsize=(plotsizex, plotsizey))
          # First 20 features have an importance of ~89.6% and first 30 have an importance of ~95.1%
          print("First 20 feature importance: \t", (rf_imp_feature_1[:20].sum()) * 100)
          print("First 30 feature importance: \t", (rf_imp_feature_1[:30].sum()) * 100)

First 20 feature importance:     Imp    89.584
dtype: float64
First 30 feature importance:     Imp    95.098
dtype: float64

[Bar chart: importances of the top 30 features, in descending order]

Above are the top 30 important features, which account for 95% of the variation in the model. These need to be analysed further during hyperparameter tuning of the models for better scores.

Model performance Summary:
Ensemble methods are performing better than linear models. Of all the ensemble models, the gradient boosting regressor is giving the best R2 score. We identified the top 30 features, which explain 95% of the variation in the (random forest) model. We will further hypertune the models to improve performance, and will further explore and evaluate the features while hypertuning the ensemble models.

Building Function/Pipeline for models

In [162]: rf_imp_feature_1[:30]

Out[162]:                         Imp
          furnished_1         0.28448
          yr_built            0.13625
          living_measure      0.09463
          living_measure15    0.06691
          quality             0.05062
          HouseLandRatio      0.04008
          lot_measure15       0.03731
          City_Bellevue       0.02532
          ceil_measure        0.02459
          quality_8           0.02049
          total_area          0.01527
          lot_measure         0.01319
          City_Seattle        0.01268
          City_Kirkland       0.01245
          City_Federal Way    0.01226
          City_Kent           0.01089
          City_Mercer Island  0.01047
          sight_4             0.00945
          quality_7           0.00942
          basement            0.00908
          City_Redmond        0.00830
          coast_1             0.00648
          City_Medina         0.00856
          quality_10          0.00545
          City_Renton         0.00821
          room_bed_4          0.00393
          City_Maple Valley   0.00388
          City_Sammamish      0.00379
          sight_3             0.00351
          City_Issaquah       0.00303

In [163]: from sklearn.pipeline import Pipeline

In [164]: def result(model, pipe_model, X_train_set, y_train_set, X_val_set, y_val_set):
              pipe_model.fit(X_train_set, y_train_set)
              # predicting over the train and validation data
              y_train_predict = pipe_model.predict(X_train_set)
              y_val_predict = pipe_model.predict(X_val_set)
              trscore = r2_score(y_train_set, y_train_predict)
              trRMSE = np.sqrt(mean_squared_error(y_train_set, y_train_predict))
              trMSE = mean_squared_error(y_train_set, y_train_predict)
              trMAE = mean_absolute_error(y_train_set, y_train_predict)
              vlscore = r2_score(y_val_set, y_val_predict)
              vlRMSE = np.sqrt(mean_squared_error(y_val_set, y_val_predict))
              vlMSE = mean_squared_error(y_val_set, y_val_predict)
              vlMAE = mean_absolute_error(y_val_set, y_val_predict)
              result_df = pd.DataFrame({'Method': [model], 'val score': vlscore,
                                        'RMSE_val': vlRMSE, 'MSE_val': vlMSE, 'MAE_val': vlMAE,
                                        'train score': trscore, 'RMSE_tr': trRMSE,
                                        'MSE_tr': trMSE, 'MAE_tr': trMAE})
              return result_df

The above function fits the given pipeline and returns the R2 score, RMSE, MSE and MAE of the model on the training and validation sets.
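The summary above promises hyperparameter tuning of the selected ensemble models; the same Pipeline pattern extends directly to that step via GridSearchCV. The sketch below is illustrative only: it uses synthetic data from make_regression in place of the report's X_train / X_val, and the grid values and step name are assumptions, not the report's actual tuning choices.

```python
# Illustrative sketch only: synthetic data stands in for the report's
# X_train / X_val; the grid values below are assumptions.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=22)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=22)

pipe = Pipeline([('GBR', GradientBoostingRegressor(random_state=22))])
param_grid = {                       # the step name 'GBR' prefixes each parameter
    'GBR__n_estimators': [100, 200],
    'GBR__learning_rate': [0.05, 0.1],
    'GBR__max_depth': [2, 3],
}
search = GridSearchCV(pipe, param_grid, scoring='r2', cv=3, n_jobs=-1)
search.fit(X_train, y_train)

print(search.best_params_)
print(search.best_estimator_.score(X_val, y_val))  # R2 on the held-out split
```

GridSearchCV refits the best parameter combination on the full training split, so best_estimator_ can be scored on held-out validation data just like the untuned pipelines.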
In [165]: # Creating an empty dataframe to capture results
          result_dff = pd.DataFrame()

          pipe_LR = Pipeline([('LR', LinearRegression())])
          result_dff = pd.concat([result_dff, result('LR', pipe_LR, X_train, y_train, X_val, y_val)])

          pipe_knr = Pipeline([('KNNR', KNeighborsRegressor(n_neighbors=4, weights='distance'))])
          result_dff = pd.concat([result_dff, result('KNNR', pipe_knr, X_train, y_train, X_val, y_val)])

          pipe_DTR = Pipeline([('DTR', DecisionTreeRegressor())])
          result_dff = pd.concat([result_dff, result('DTR', pipe_DTR, X_train, y_train, X_val, y_val)])

          pipe_GBR = Pipeline([('GBR', GradientBoostingRegressor(n_estimators=200, learning_rate=0.1, random_state=22))])
          result_dff = pd.concat([result_dff, result('GBR', pipe_GBR, X_train, y_train, X_val, y_val)])

          pipe_BGR = Pipeline([('BGR', BaggingRegressor(n_estimators=50, oob_score=True, random_state=14))])
          result_dff = pd.concat([result_dff, result('BGR', pipe_BGR, X_train, y_train, X_val, y_val)])

          pipe_RFR = Pipeline([('RFR', RandomForestRegressor())])
          result_dff = pd.concat([result_dff, result('RFR', pipe_RFR, X_train, y_train, X_val, y_val)])

          result_dff

Out[165]:   Method  val score       RMSE_val       MSE_val  train score        RMSE_tr  ...
          0     LR   0.718749  137733.608615  1.897057e+10     0.730112  132958.967261  ...
          0   KNNR   0.425008  196935.451160  3.878357e+10     0.998628    9480.192071  ...
          0    DTR   0.537219  176677.375867  3.121490e+10     0.998628    9480.192071  ...
          0    GBR   0.782471  121129.989228  1.467247e+10     0.820821  108334.766538  ...
          0    BGR   0.769319  124738.101557  1.555959e+10     0.966466   46867.181534  ...
          0    RFR   0.757473  127900.773592  1.635861e+10     0.955380   54061.682258  ...

The above sequence of steps, using the pipeline function, runs all the models and compiles the scores in the result_dff dataframe. We can see that these two steps are much more concise than running the individual models and compiling the scores as earlier. We can also clearly see that gradient boosting gives a better result than the other ensemble methods, and its training score of 0.82 indicates no overfitting of the model.

In [166]: # Storing results of the initial dataset - dff
          result_ds1 = result_dff.copy()
          result_ds1

Out[166]:   Method  val score       RMSE_val       MSE_val  train score        RMSE_tr  ...
          0     LR   0.718749  137733.608615  1.897057e+10     0.730112  132958.967261  ...
          0   KNNR   0.425008  196935.451160  3.878357e+10     0.998628    9480.192071  ...
          0    DTR   0.537219  176677.375867  3.121490e+10     0.998628    9480.192071  ...
          0    GBR   0.782471  121129.989228  1.467247e+10     0.820821  108334.766538  ...
          0    BGR   0.769319  124738.101557  1.555959e+10     0.966466   46867.181534  ...
          0    RFR   0.757473  127900.773592  1.635861e+10     0.955380   54061.682258  ...

FEATURE SELECTION (PCA)

Now we will explore the possibility of feature reduction using PCA.

In [167]: dff.shape

Out[167]: (18287, 91)
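The 91 columns counted above arise because the categorical features (room_bed, room_bath, quality, City, and so on) were one-hot encoded into dummy columns earlier in the report. A minimal sketch of that expansion on a small hypothetical frame — the values and column names here only mirror the report's style, they are not its data:

```python
import pandas as pd

# Hypothetical mini-frame; names only mirror the report's column style.
df = pd.DataFrame({
    'price':   [450000, 380000, 620000],
    'quality': [7, 8, 7],
    'City':    ['Seattle', 'Kent', 'Bellevue'],
})

# Each level of an encoded column becomes its own 0/1 indicator column.
dummies = pd.get_dummies(df, columns=['quality', 'City'])
print(list(dummies.columns))
# → ['price', 'quality_7', 'quality_8', 'City_Bellevue', 'City_Kent', 'City_Seattle']
```

This is why a handful of raw categorical features can balloon into 90 predictors once every bedroom count, bathroom count, quality grade and city gets its own indicator column.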
In [168]: dff.columns

Out[168]: Index(['price', 'yr_built', 'HouseLandRatio', 'living_measure', 'lot_measure',
                 'ceil_measure', 'basement', 'living_measure15', 'lot_measure15',
                 'total_area', 'room_bed_1', 'room_bed_2', 'room_bed_3', 'room_bed_4',
                 'room_bed_5', 'room_bed_6', 'room_bed_7', 'room_bed_8', 'room_bed_9',
                 'room_bed_10', 'room_bed_11', 'room_bath_0.5', 'room_bath_0.75',
                 'room_bath_1.0', 'room_bath_1.25', 'room_bath_1.5', 'room_bath_1.75',
                 'room_bath_2.0', 'room_bath_2.25', 'room_bath_2.5', 'room_bath_2.75',
                 'room_bath_3.0', 'room_bath_3.25', 'room_bath_3.5', 'room_bath_3.75',
                 'room_bath_4.0', 'room_bath_4.25', 'room_bath_4.5', 'room_bath_4.75',
                 'room_bath_5.0', 'room_bath_5.25', 'room_bath_5.75', 'ceil_1.5',
                 'ceil_2.0', 'ceil_2.5', 'ceil_3.0', 'ceil_3.5', 'coast_1', 'sight_1',
                 'sight_2', 'sight_3', 'sight_4', 'condition_2', 'condition_3',
                 'condition_4', 'condition_5', 'quality_4', 'quality_5', 'quality_6',
                 'quality_7', 'quality_8', 'quality_9', 'quality_10', 'quality_11',
                 'quality_12', 'furnished_1', 'City_Bellevue', 'City_Black Diamond',
                 'City_Bothell', 'City_Carnation', 'City_Duvall', 'City_Enumclaw',
                 'City_Fall City', 'City_Federal Way', 'City_Issaquah', 'City_Kenmore',
                 'City_Kent', 'City_Kirkland', 'City_Maple Valley', 'City_Medina',
                 'City_Mercer Island', 'City_North Bend', 'City_Redmond', 'City_Renton',
                 'City_Sammamish', 'City_Seattle', 'City_Snoqualmie', 'City_Vashon',
                 'City_Woodinville', 'has_basement_Yes', 'has_renovated_Yes'],
                dtype='object')

We will drop the price column, as it is the target variable.

In [169]: df_pca = dff.drop(['price'], axis=1)

In [170]: numerical_cols = df_pca.copy()
          numerical_cols.shape

Out[170]: (18287, 90)

In [171]: # Let's first transform the entire X (independent variable data) to z-scores.
          # We will create the PCA dimensions on this distribution.
          from scipy.stats import zscore

          # As PCA works on the independent numerical columns, let's pass numerical_cols
          numerical_cols = numerical_cols.apply(zscore)
          cov_matrix = np.cov(numerical_cols.T)
          print('Covariance Matrix \n%s', cov_matrix)
Covariance Matrix
[[ 1.00005469  0.20028185  0.84597846 ...  0.01415428  0.20094885  0.05257785]
 [ 0.20028185  1.00005469  0.1663024  ...  0.08035946 -0.02988448 -0.00617414]
 [ 0.84597846  0.1663024   1.00005469 ...  0.01649371 -0.27730605  0.01739462]
 ...
 [ 0.01415428  0.08035946  0.01649371 ...  1.00005469 -0.0056238  -0.01445085]
 [ 0.20094885 -0.02988448 -0.27730605 ... -0.0056238   1.00005469  0.04524435]
 [ 0.05257785 -0.00617414  0.01739462 ... -0.01445085  0.04524435  1.00005469]]

As we can see, the closer a value is to 1, the more strongly the two features are related.

In [172]: eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)
          print('Eigen Vectors \n%s', eigenvectors)
          print('\n Eigen Values \n%s', eigenvalues)

Eigen Vectors
[[ 3.38140157e-01 -5.91272225e-02  2.10933458e-01 ... -5.27282174e-03
   5.54142192e-03 -1.67124034e-04]
 [ 7.12659835e-02 -4.34121260e-01 -8.88436080e-02 ... -1.68818774e-02
   4.57078107e-03 -8.52995967e-03]
 [ 3.49772357e-01 -8.00383876e-03 -4.05156781e-02 ... -4.68496505e-03
  -1.39683527e-02  2.43243338e-03]
 ...
 [ 1.29688720e-02 -3.80398560e-02 -3.41813657e-02 ...  1.07174062e-01
   8.41737918e-02 -1.95413367e-01]
 [-2.50075399e-02 -3.74622282e-02  4.39476539e-01 ...  3.99982110e-03
   4.08181763e-02  2.32617269e-02]
 [-3.18537004e-03 -6.23000043e-04  1.02661626e-01 ...  2.84579669e-02
  -1.47963772e-02 -6.62692406e-02]]

Eigen Values
[6.40030103e+00 4.23053272e+00 3.02200570e+00 2.36069955e+00
 1.72278028e+00 1.70533047e+00 5.17634008e-02 7.84864255e-02
 1.23323929e-01 1.58239483e+00 ...]
[90 eigenvalues in total; remainder of the printout truncated]

In [173]: # Let's sort eigenvalues in descending order
          # Make a list of (eigenvalue, eigenvector) pairs
          eig_pairs = [(eigenvalues[index], eigenvectors[:, index]) for index in range(len(eigenvalues))]

          # Sort the (eigenvalue, eigenvector) pairs from highest to lowest with respect to eigenvalue
          eig_pairs.sort()
          eig_pairs.reverse()
          print(eig_pairs)

          # Extract the descending ordered eigenvalues and eigenvectors
          eigvalues_sorted = [eig_pairs[index][0] for index in range(len(eigenvalues))]
          eigvectors_sorted = [eig_pairs[index][1] for index in range(len(eigenvalues))]

          # Let's confirm our sorting worked, print out eigenvalues
          print('Eigenvalues in descending order: \n%s' % eigvalues_sorted)
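A side note on the diagonal entries of 1.00005469 in the covariance matrix above: scipy.stats.zscore standardizes with the population standard deviation (ddof=0), while np.cov defaults to the sample convention (ddof=1), so each diagonal entry comes out as n/(n-1) = 18287/18286 ≈ 1.0000547 rather than exactly 1. A small sketch with synthetic data (the three-column shape is arbitrary; only the row count matches the report's dataset):

```python
import numpy as np
from scipy.stats import zscore

rng = np.random.default_rng(0)
n = 18287                        # same row count as the report's dataset
X = rng.normal(size=(n, 3))      # synthetic stand-in for the feature matrix

Z = zscore(X)                    # divides by the population std (ddof=0)
cov = np.cov(Z.T)                # sample covariance (divides by n - 1)

# The diagonal is n / (n - 1), which for n = 18287 is the 1.00005469 above.
print(np.diag(cov))
print(n / (n - 1))
```

So the diagonal slightly exceeding 1 is a degrees-of-freedom convention, not a computational error, and the off-diagonal entries are (up to the same n/(n-1) factor) the pairwise correlations.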
[(6.400301029851477, array([ 3.38140157e-01,  7.12659835e-02,  3.49772357e-01, ...,
        -2.50075399e-02, -3.18537004e-03])),
 (4.2305327150..., array([-5.91272225e-02, -4.34121260e-01, -8.00383876e-03, ...])),
 (3.0220056962108997, array([ 2.10933458e-01, -8.88436080e-02, -4.05156781e-02, ...])),
 ...]
[printout truncated: 90 (eigenvalue, eigenvector) pairs in descending order of eigenvalue]

Eigenvalues in descending order:
[6.400301029851477, 4.2305327150..., 3.0220056962108997, 2.3606995503606907,
 1.7227802802980245, 1.7053304684180373, 1.5823948341725016, 1.5166979298471606,
 1.4840050960170488, 1.3921255385114626, 1.338123871891159, 1.2745588269563284, ...]