ABSTRACT
Credit card fraud is a significant financial threat that affects millions of users worldwide,
leading to substantial economic losses. With the advancement of technology, fraudsters employ
sophisticated techniques such as phishing, malware, and social engineering to steal sensitive
credit card information. As a result, the development of effective fraud detection systems has
become crucial to mitigate risks and protect users. Machine learning-based methods, particularly
the Random Forest algorithm, have demonstrated high accuracy and efficiency in detecting
fraudulent transactions. This report explores the application of Random Forest in credit card
fraud detection, comparing different variations of the algorithm and analyzing their
effectiveness.
The fraud detection process begins with the collection of historical credit card transaction
data, which includes both legitimate and fraudulent transactions. Feature engineering is then
performed to extract key attributes such as transaction amount, time, location, merchant details,
and user behavior patterns. Two Random Forest variants are implemented: one that uses
standard decision trees as base classifiers, and one that uses more complex base learners
such as gradient-boosted trees. Both models are trained on labeled data to distinguish
legitimate transactions from fraudulent ones.
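The feature engineering step described above can be illustrated with a minimal sketch. It is not the report's actual pipeline: the field names (user_id, amount, timestamp, country, home_country) and the derived features (is_night, foreign, amount_ratio) are hypothetical, chosen only to show how amount, time, location, and user behavior might be turned into numeric inputs.

```python
from datetime import datetime
from statistics import mean


def engineer_features(transactions):
    """Turn raw transaction dicts into numeric feature rows.

    All field names are hypothetical placeholders, not a real schema.
    """
    # Average spend per user, used as a simple behaviour baseline.
    amounts_by_user = {}
    for tx in transactions:
        amounts_by_user.setdefault(tx["user_id"], []).append(tx["amount"])
    user_mean = {user: mean(vals) for user, vals in amounts_by_user.items()}

    rows = []
    for tx in transactions:
        hour = datetime.fromisoformat(tx["timestamp"]).hour
        baseline = user_mean[tx["user_id"]] or 1.0  # avoid division by zero
        rows.append({
            "amount": tx["amount"],
            "hour": hour,
            "is_night": int(hour < 6),  # assumed cut-off for night-time activity
            "foreign": int(tx["country"] != tx["home_country"]),
            "amount_ratio": tx["amount"] / baseline,  # spend relative to user's mean
        })
    return rows


sample = [
    {"user_id": "u1", "amount": 20.0, "timestamp": "2024-01-05T14:30:00",
     "country": "IN", "home_country": "IN"},
    {"user_id": "u1", "amount": 980.0, "timestamp": "2024-01-06T03:10:00",
     "country": "RU", "home_country": "IN"},
]
features = engineer_features(sample)
```

In this sketch the per-user mean includes the transaction being scored; a production system would more likely compute the baseline from earlier transactions only.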
Random Forest is an ensemble learning method that constructs multiple decision trees
and aggregates their predictions to improve classification accuracy. Each tree is trained on
a random sample of the data and a random subset of the features, which reduces overfitting
and improves generalization. The share of trees that vote for the fraud class gives each
transaction a fraud probability, and the decision threshold applied to that probability
determines the balance between precision and recall.
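To make the ensemble mechanism concrete, the following is a deliberately simplified sketch, not the implementation evaluated in this report. It assumes one-split trees (decision stumps) in place of full decision trees, bootstrap sampling for the random data subsets, and a tiny made-up data set with two hypothetical features (amount and a foreign-transaction flag). The function names are illustrative only.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def majority(labels, fallback):
    """Majority class of 0/1 labels; ties go to fraud (1)."""
    return int(2 * sum(labels) >= len(labels)) if labels else fallback


def fit_stump(X, y, feature_ids):
    """Best single-split tree restricted to the candidate features."""
    best = None
    fallback = majority(y, 0)
    for f in feature_ids:
        for threshold in sorted({row[f] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[f] <= threshold]
            right = [y[i] for i, row in enumerate(X) if row[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, threshold,
                        majority(left, fallback), majority(right, fallback))
    return best[1:]  # (feature, threshold, left_label, right_label)


def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # sample with replacement
        feats = rng.sample(range(len(X[0])), n_features)      # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def fraud_probability(forest, row):
    """Fraction of trees that vote 'fraud' for one transaction."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return sum(votes) / len(votes)


# Toy data: [amount, foreign flag]; label 1 marks fraud. Values are invented.
X = [[12, 0], [25, 0], [40, 0], [18, 0], [900, 1], [1200, 1], [30, 0], [750, 1]]
y = [0, 0, 0, 0, 1, 1, 0, 1]

forest = fit_forest(X, y)
p_suspicious = fraud_probability(forest, [1000, 1])
p_ordinary = fraud_probability(forest, [20, 0])
```

A transaction would be flagged when its probability exceeds a chosen threshold; lowering the threshold catches more fraud at the cost of more false alarms.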