Two-Stage Fraud Detection

XGBoost scores every transaction at scale, then a Google Gemini layer reasons over only the high-risk flags and writes an analyst-ready verdict.

Result: 0.978 AUC, with 84% recall and 84% precision on the rare class (fraud is under 1% of the data).

Stack: Python, XGBoost, Google Gemini, scikit-learn, pandas
Live case study: https://pranavkaja.vercel.app/projects/fraud-triage
Write-up: https://pranavkaja.vercel.app/blog/two-stage-fraud-detection

The problem

Most fraud systems hand the analyst a number. The model scores a transaction 0.87 and the analyst has to work out, from scratch, why it looks wrong and what to do. That is fine at low volume and brutal at scale, where card networks see millions of transactions a day, each one real money.

Approach

Two stages, each good at a different job.

Stage 1, XGBoost scores all 284,807 historical transactions with a fraud probability. Fast and cheap, so it scales to the full firehose and flags only the highest-risk handful.
Stage 2, a Gemini triage agent reads the raw feature values of each flagged transaction and writes a structured case file in a fixed JSON shape: verdict, confidence, primary signals, reasoning, and a recommended action.
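To make the fixed JSON shape concrete, here is a minimal sketch of validating a case file before it reaches an analyst. The key names (verdict, confidence, primary_signals, reasoning, recommended_action) and the parse_case_file helper are assumptions chosen for illustration; the notebook's actual schema may differ.

```python
import json

# Hypothetical field names and types; the README lists the fields
# but not their exact JSON keys.
REQUIRED_FIELDS = {
    "verdict": str,
    "confidence": (int, float),
    "primary_signals": list,
    "reasoning": str,
    "recommended_action": str,
}


def parse_case_file(raw: str) -> dict:
    """Parse an LLM reply and check it has every expected field and type."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("case file must be a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"wrong type for field: {name}")
    return data
```

Rejecting malformed replies at this boundary keeps a free-text or truncated model response from being mistaken for a verdict.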

A vanilla baseline stalled at 0.60 AUC. The cause was the data, not the algorithm: a 577-to-1 class imbalance drowned the loss, and 1,081 duplicate rows skewed it. Setting scale_pos_weight=577 and adding L2 regularization raised AUC from 0.60 to 0.978.
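The imbalance fix can be sketched as follows. The weight is the ratio of negative to positive examples, which is where a value like 577 comes from. The helper names here are hypothetical, and the reg_lambda value is a placeholder because the README does not state the L2 strength used. The dictionary keys are standard XGBoost parameter names.

```python
from collections import Counter


def scale_pos_weight(labels):
    """Return negatives / positives for binary labels (0 = legit, 1 = fraud)."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive examples in labels")
    return counts[0] / counts[1]


def drop_duplicate_rows(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def build_params(labels, l2=1.0):
    """Assemble XGBoost parameters; l2=1.0 is an assumed placeholder."""
    return {
        "objective": "binary:logistic",
        "eval_metric": "auc",
        "scale_pos_weight": scale_pos_weight(labels),
        "reg_lambda": l2,  # L2 regularization on leaf weights
    }
```

With xgboost installed, these parameters could be passed as xgboost.XGBClassifier(**build_params(y_train)). Computing the weight from the training labels, rather than hardcoding it, keeps it correct if deduplication or a different split changes the class counts.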

Results

0.978 AUC, up from a 0.60 baseline.
In testing, the LLM layer caught a fraud that the model had scored 0.000000 (effectively certain it was legitimate), by reading the raw features instead of relying on the model's confidence.
A rate-limited batch processor handles 100 transactions in about 10 minutes with caching, projecting to roughly $30/month at 100K transactions/day.
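A rate-limited batch processor with caching might look like the sketch below. It is an illustration under assumptions, not the notebook's code: the RateLimitedTriage class, the analyze callable, and the 6-second spacing (derived from 100 transactions in about 10 minutes) are all hypothetical. The clock and sleep functions are injectable so the pacing can be tested without waiting.

```python
import time


class RateLimitedTriage:
    """Run an analysis function over transactions, spacing out uncached calls."""

    def __init__(self, analyze, min_interval_s=6.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._analyze = analyze          # e.g. a function that calls the LLM
        self._min_interval = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._cache = {}
        self._last_call = None

    def run(self, transactions):
        results = []
        for tx in transactions:
            # Transactions are dicts of feature name -> hashable value.
            key = tuple(sorted(tx.items()))
            if key not in self._cache:
                self._wait_for_slot()
                self._cache[key] = self._analyze(tx)
            results.append(self._cache[key])
        return results

    def _wait_for_slot(self):
        # Sleep only if the previous uncached call was too recent.
        if self._last_call is not None:
            remaining = self._min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
        self._last_call = self._clock()
```

Cached transactions skip both the wait and the API call, so repeated inputs cost nothing extra.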

Run it

git clone https://github.com/PranavKaja/fraud-triage.git
cd fraud-triage
pip install -r requirements.txt
# set your Gemini API key (local: export it; Colab: add GEMINI_API_KEY to userdata)
export GEMINI_API_KEY=...
jupyter notebook   # open LLM_Fraud_Detection.ipynb

Built and run in Google Colab. The notebook reads the Gemini key from Colab userdata, so no key is hardcoded. Stage 1 trains on the public credit-card dataset (download creditcard.csv from Kaggle).
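For running the same code both locally and in Colab, one possible way to load the key is sketched below. The load_gemini_key helper is hypothetical. It checks the environment first and falls back to Colab's userdata store, which is only importable inside Colab.

```python
import os


def load_gemini_key(name="GEMINI_API_KEY"):
    """Return the API key from the environment, else from Colab userdata."""
    key = os.environ.get(name)
    if key:
        return key
    try:
        # Available only inside a Google Colab runtime.
        from google.colab import userdata
    except ImportError:
        raise RuntimeError(
            f"{name} is not set and Colab userdata is unavailable"
        ) from None
    key = userdata.get(name)
    if not key:
        raise RuntimeError(f"{name} is empty in Colab userdata")
    return key
```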

Notes

Trained on a public, anonymized credit-card dataset (284,807 transactions; features V1 to V28 are PCA components). No real cardholder data.

Part of my portfolio. Built by Pranav Kaja.
