Okay

Uploaded by

jingskie588
Copyright
© All Rights Reserved

Cedric Rey B.

Iganacio BSIT 3-A

1. Handling Missing Data:

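Explanation: This involves dealing with records where one or more values are incomplete or missing. You
must decide whether to remove the affected rows or columns, replace (impute) the missing values with a
statistic such as the mean or median, or interpolate them from neighboring values.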

Example: If you have a dataset with missing age values, you could choose to replace the missing values
with the mean or median age of the available data.
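
The age example above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a prescribed method: the function name impute_missing, the use of None to mark a missing value, and the sample ages are all assumptions made for the example.

```python
from statistics import mean, median

def impute_missing(values, strategy="median"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Keep observed values as-is and fill only the gaps.
    return [fill if v is None else v for v in values]

# Hypothetical ages; None marks a missing entry.
ages = [25, None, 31, 40, None, 22]
ages_median = impute_missing(ages, "median")
ages_mean = impute_missing(ages, "mean")
print(ages_median)
print(ages_mean)
```

The median is usually the safer choice when the data is skewed, because a few extreme ages shift the mean more than the median.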

2. Outlier Detection and Removal:

Explanation: Outliers are data points that differ markedly from the rest of the dataset. They can skew
statistics such as the mean and pull a fitted model toward unrepresentative values, so identifying them
and deciding how to handle them is important for improving model robustness.

Example: If you're analyzing a dataset of product prices and there's an unusually high value, removing
or transforming it can prevent it from disproportionately influencing the model.
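
One common way to flag such a value is the interquartile range (IQR) rule, sketched below. The 1.5 multiplier is a widely used convention rather than something the text above specifies, and the function name split_outliers and the sample prices are assumptions made for the example.

```python
from statistics import quantiles

def split_outliers(values, k=1.5):
    """Split values into (kept, outliers) using the IQR rule.

    A value is treated as an outlier if it lies more than k * IQR
    below the first quartile or above the third quartile.
    """
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return kept, outliers

# Hypothetical product prices with one unusually high value.
prices = [10, 12, 11, 13, 12, 11, 250]
kept, outliers = split_outliers(prices)
print(kept)
print(outliers)
```

Removal is only one option. If the extreme value is genuine rather than a data-entry error, capping it at the upper bound or applying a log transform keeps the record while reducing its influence.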

3. Data Standardization or Normalization:

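Explanation: Standardizing or normalizing features puts the data on a comparable scale. Normalization
(min-max scaling) rescales each feature to a fixed range such as 0 to 1, while standardization rescales
it to zero mean and unit variance. This is particularly important for algorithms sensitive to the
magnitude of input features, such as distance-based methods like k-nearest neighbors.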

Example: If your dataset includes features with different units (e.g., height in meters and weight in
kilograms), normalizing them to a standard scale (e.g., between 0 and 1) ensures balanced contributions
to the model.
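
The height and weight example can be sketched as follows, showing both min-max normalization and z-score standardization. This uses only the Python standard library; the function names and the sample measurements are assumptions made for the example.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no scale information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical measurements in different units.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [50, 60, 70, 80, 90]

heights_scaled = min_max_scale(heights_m)
weights_scaled = min_max_scale(weights_kg)
weights_z = standardize(weights_kg)
print(heights_scaled)
print(weights_scaled)
print(weights_z)
```

After min-max scaling, both features span 0 to 1, so a distance computed over them is no longer dominated by the larger raw weight values. In practice the scaling parameters should be computed on the training data only and then reused on new data.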

Applying these techniques helps ensure that your machine learning model is trained on clean, reliable
data, improving its performance and generalization to new data.
