This notebook applies four widely used supervised machine learning techniques, namely Decision Trees, Logistic Regression, Random Forest and XGBoost, to the problem of credit card fraud detection. It describes the experimental setup: the data used, the limitations of the study, the findings from exploratory data analysis, the data pre-processing steps taken and the resulting performance of the four classifiers.
This research uses the IEEE-CIS Fraud Detection dataset, which Vesta Corporation has made publicly available on Kaggle. The data was collected by Vesta’s fraud protection system and digital security partners across several regions, including North America, Latin America and Europe, over a 6-month period measured from a reference date and time that remains withheld.
An article that follows (pending peer review) will conclude this work and provide some suggestions for further studies.