In this project I will show an example of how resampling can be useful for unbalanced datasets in binary classification problems. I'll be using a logistic regression model to demonstrate this. I'm aware that there are a vast amount of tools and libraries to deal with resampling and I strongly recommend that you use a combination of these methods to deal with unbalanced datasets. Here I would like to simply demonstrate the pros and cons of resampling using Sklearn's resample()
.
dataset: https://www.kaggle.com/datasets/yashpaloswal/fraud-detection-credit-card