Predicting House Prices and Grades: Business Computing - II
GROUP NUMBER 12
RICHA BHARDWAJ MBAA24116
SAYANTAN MBAA24122
SHIVANGI AGARWAL MBAA24124
SNEHA RAMPURIA MBAA24130
PRIYANKA S MBAA24167
DIKSHA ARYA MBAA24090
INTRODUCTION
Our aim is to analyze the King County House Prices Dataset from
Kaggle (https://www.kaggle.com/datasets/gauravduttakiit/predict-the-house-prices-king-county)
to explore key patterns, conduct statistical analysis, and build
predictive models for price estimation.
Numerical Features: Price, Square Footage (living, lot, basement), Number of
Bedrooms/Bathrooms, Year Built, Grade, Condition, Floors.
Categorical Features: Waterfront Presence, Renovation Status, Zip Code, View Rating.
Target Variables: Price, Grade.
DATA PREPROCESSING
ANOVA
Null hypothesis: House condition does not significantly affect the average sale price.
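As a rough illustration of the test behind this hypothesis, the sketch below computes a one-way ANOVA F statistic by hand in Python (standard library only). The price figures and condition ratings are made-up placeholders, not values from the King County dataset, and the helper name one_way_anova_f is hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups (condition levels)
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical sale prices (in $1000s) keyed by condition rating; not real data.
prices_by_condition = {
    3: [400, 420, 410],
    4: [450, 470, 460],
    5: [500, 520, 510],
}
f_stat, df_b, df_w = one_way_anova_f(list(prices_by_condition.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the F distribution with those degrees of freedom would lead to rejecting the hypothesis that condition has no effect on mean price.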
LINEAR REGRESSION (Price Prediction)
Model Evaluation Metrics
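The regression metrics named in this deck, RMSE and R-squared, can be sketched in a few lines of Python. The actual and predicted prices below are illustrative placeholders, not model output from the project.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: typical size of a prediction error."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))

def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical prices in $1000s; not results from the project.
actual_prices = [300, 400, 500, 600]
predicted_prices = [310, 390, 520, 580]

model_rmse = rmse(actual_prices, predicted_prices)
model_r2 = r_squared(actual_prices, predicted_prices)
print(f"RMSE = {model_rmse:.2f}, R-squared = {model_r2:.2f}")
```

RMSE is expressed in the same units as price, so it reads as a typical error size, while R-squared is unitless and easier to compare across models.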
Conversion of Grade to Binary (0 and 1)
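A minimal Python sketch of this recoding, using the cut-off stated in the deck's summary (grades below 7 are 'Bad', grades of 7 or more are 'Good'). The function name and sample grades are hypothetical.

```python
GOOD_GRADE_THRESHOLD = 7  # cut-off taken from the deck: grade >= 7 counts as 'Good'

def grade_to_binary(grade, threshold=GOOD_GRADE_THRESHOLD):
    """Return 1 ('Good') if grade >= threshold, else 0 ('Bad')."""
    return 1 if grade >= threshold else 0

# Hypothetical grade values for illustration only.
grades = [5, 6, 7, 8, 11]
binary_grades = [grade_to_binary(g) for g in grades]
print(binary_grades)
```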
Model Evaluation
Confusion Matrix for Classification Models
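For a binary target such as the 0/1 grade, a confusion matrix counts the four possible outcomes of a prediction. The sketch below tallies them and derives accuracy in plain Python; the labels are invented for illustration and the function names are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1   # Good predicted as Good
        elif a == 0 and p == 0:
            counts["tn"] += 1   # Bad predicted as Bad
        elif a == 0 and p == 1:
            counts["fp"] += 1   # Bad predicted as Good
        else:
            counts["fn"] += 1   # Good predicted as Bad
    return counts

def accuracy(counts):
    """Fraction of predictions that match the actual label."""
    correct = counts["tp"] + counts["tn"]
    return correct / sum(counts.values())

# Hypothetical labels; not predictions from the project's models.
actual_labels = [1, 0, 1, 1, 0, 0, 1, 0]
predicted_labels = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_counts(actual_labels, predicted_labels)
acc = accuracy(cm)
print(cm, acc)
```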
LDA
Classification Model
LDA: A classification algorithm suited to continuous (not categorical) input
features. Used here to predict the binary grade.
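To make the idea concrete, here is a minimal single-feature LDA sketch in Python. It assumes one continuous input, a variance shared across classes, and priors estimated from class frequencies. It is a from-scratch illustration, not the routine used in the project; the data and function names are hypothetical.

```python
import math
from statistics import mean

def fit_lda_1d(xs, ys):
    """Estimate per-class (mean, prior) and a pooled variance."""
    classes = sorted(set(ys))
    n = len(xs)
    params = {}
    pooled_ss = 0.0
    for c in classes:
        values = [x for x, y in zip(xs, ys) if y == c]
        mu = mean(values)
        pooled_ss += sum((v - mu) ** 2 for v in values)
        params[c] = (mu, len(values) / n)
    variance = pooled_ss / (n - len(classes))
    return params, variance

def predict_lda_1d(x, params, variance):
    """Pick the class with the largest linear discriminant score."""
    def score(c):
        mu, prior = params[c]
        return x * mu / variance - mu ** 2 / (2 * variance) + math.log(prior)
    return max(params, key=score)

# Hypothetical feature values (e.g. a scaled size measure) and 0/1 grades.
feature = [1.0, 1.5, 2.0, 3.0, 3.5, 4.0]
grade = [0, 0, 0, 1, 1, 1]

lda_params, lda_var = fit_lda_1d(feature, grade)
print(predict_lda_1d(2.4, lda_params, lda_var))
print(predict_lda_1d(2.6, lda_params, lda_var))
```

With equal priors and a shared variance, the decision boundary sits midway between the two class means, which is why values just below and just above 2.5 fall into different classes in this toy data.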
Model Evaluation Metrics
Classification Model
Logistic Regression: A statistical model used for binary or multi-class
classification. Input features can be continuous or categorical.
Used here to predict the binary grade.
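The sketch below shows the idea behind logistic regression with a single feature, fitted by plain gradient descent in Python. This is an illustrative stand-in, not the fitting procedure of R's glm that the deck mentions; the data, learning rate, epoch count, and function names are all assumptions.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit weight w and intercept b by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def predict_grade(x, w, b, cutoff=0.5):
    """Return 1 ('Good') when the predicted probability reaches the cutoff."""
    return 1 if sigmoid(w * x + b) >= cutoff else 0

# Hypothetical standardized feature values and 0/1 grades; not project data.
scaled_feature = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
binary_grade = [0, 0, 0, 0, 1, 1, 1, 1]

weight, intercept = fit_logistic(scaled_feature, binary_grade)
fitted = [predict_grade(x, weight, intercept) for x in scaled_feature]
print(weight, intercept, fitted)
```

A positive fitted weight means larger feature values raise the predicted probability of a 'Good' grade.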
Model Evaluation Metrics
We used machine learning models to predict house prices (a continuous target) and property
grades (a categorical target) from input factors such as location, size, number of rooms, and
amenities.
For price prediction, we applied Linear Regression (lm), supported by statistical analysis
(t-tests, ANOVA, and Z-scores), along with Random Forest for better accuracy.
For grade classification, we recoded grades as 0 ('Bad', grade < 7) or 1 ('Good', grade ≥ 7) and
fitted Logistic Regression (glm) and Linear Discriminant Analysis (LDA).
We evaluated the regression models with RMSE and R-squared, and the classification models
with confusion matrices and accuracy scores.
Feature importance analysis and correlation visualizations helped identify the key factors
influencing house prices and grades, which can support real estate decision-making.
Thank you