
A machine learning project for binary classification of skin cancer as malignant or benign, using XGBoost, LightGBM (LGBMClassifier), AdaBoost, SVM, and Logistic Regression. Features data preprocessing, model training, and evaluation.


Awais-Asghar/Skin-Cancer-Binary-Classifier


Skin-Cancer-Binary-Classifier


A machine learning project for binary classification of skin cancer as malignant or benign, using XGBoost, LightGBM (LGBMClassifier), AdaBoost, SVM, and Logistic Regression. The pipeline covers preprocessing, visualization, modeling, and evaluation, and is intended as a diagnostic aid in dermatology.

Introduction

EDA & Preprocessing

Exploratory Data Analysis

  • Load and integrate image and metadata files.

  • Visualize sample images for quality checks.

  • Count of targets by anatomical site.

  • Count of targets by age.

  • Count of targets by sex.

  • Analyze class distribution (malignant vs. benign): 400,666 benign and 393 malignant samples.

  • Review metadata (patient age, lesion location, etc.).
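
The class counts listed above imply a very strong imbalance, which is worth quantifying before choosing metrics or class weights. The sketch below only does the arithmetic on the two reported counts. Whether the notebook applies any class weighting is not stated here, so the `scale_pos_weight` remark is an assumption about how such a ratio is commonly used, not a description of the project's code.

```python
# Class counts as reported in the class-distribution bullet above.
counts = {"benign": 400_666, "malignant": 393}

total = sum(counts.values())
malignant_fraction = counts["malignant"] / total

# Ratio of negatives to positives. The XGBoost documentation suggests this
# ratio as a typical starting value for `scale_pos_weight` on imbalanced data;
# using it in this project is an assumption, not something the README confirms.
neg_pos_ratio = counts["benign"] / counts["malignant"]

print(f"total samples:      {total}")
print(f"malignant fraction: {malignant_fraction:.4%}")
print(f"benign : malignant: {neg_pos_ratio:.1f} : 1")
```

With fewer than one malignant sample per thousand, recall and AUC tend to be more informative than raw accuracy.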

Preprocessing

  • Data Cleaning: Remove duplicates, handle missing values.

  • Data Augmentation: Random rotations, flips, brightness/contrast variations.

  • Normalization: Scale pixel intensities, normalize image dimensions.

  • Resizing: Uniform image dimensions while preserving aspect ratio.
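
The preprocessing steps above can be sketched with NumPy alone. This is a minimal illustration, not the notebook's implementation: the helper names (`pad_to_square`, `resize_nearest`, `normalize`, `augment`), the 64-pixel target size, the jitter ranges, and the restriction of rotations to multiples of 90 degrees are all assumptions made for the example. The project itself lists OpenCV, which would more likely be used for resizing and arbitrary-angle rotation.

```python
import numpy as np

def pad_to_square(img, fill=0):
    """Pad the shorter side so height == width, keeping the content centred."""
    h, w = img.shape[:2]
    size = max(h, w)
    top, left = (size - h) // 2, (size - w) // 2
    out = np.full((size, size) + img.shape[2:], fill, dtype=img.dtype)
    out[top:top + h, left:left + w] = img
    return out

def resize_nearest(img, size):
    """Nearest-neighbour resize to (size, size) by sampling source rows/columns."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def normalize(img):
    """Scale uint8 pixel intensities to float32 values in [0, 1]."""
    return img.astype(np.float32) / 255.0

def augment(img, rng):
    """Random horizontal flip, 90-degree-step rotation, brightness/contrast jitter."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                          # horizontal flip
    img = np.rot90(img, k=int(rng.integers(0, 4)))  # 0, 90, 180 or 270 degrees
    contrast = rng.uniform(0.8, 1.2)                # hypothetical jitter ranges
    brightness = rng.uniform(-0.1, 0.1)
    return np.clip(img * contrast + brightness, 0.0, 1.0)

rng = np.random.default_rng(0)
# Synthetic stand-in for a lesion photo (height 90, width 120, RGB).
raw = rng.integers(0, 256, size=(90, 120, 3), dtype=np.uint8)

# Padding before resizing keeps the aspect ratio of the original content.
prepared = normalize(resize_nearest(pad_to_square(raw), 64))
augmented = augment(prepared, rng)
print(prepared.shape, augmented.shape)
```

Augmentation is normally applied only to training images, so that validation scores reflect unmodified inputs.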


Modeling & Training

Models Used

  • XGBoost
  • AdaBoost
  • LightGBM (LGBMClassifier)
  • Support Vector Machine (SVM)
  • Logistic Regression
  • (Optional) Neural Network (for deep learning approach)



Training Techniques

  • Hyperparameter tuning using Grid/Random Search.
  • K-Fold Cross-validation.
  • Model-specific optimizations (e.g., SVM kernels, NN architecture).
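
As a rough illustration of grid search combined with cross-validation, the sketch below tunes a Logistic Regression with scikit-learn. Several things here are assumptions rather than the notebook's settings: the synthetic data from `make_classification`, the `C` grid, `class_weight="balanced"`, and the use of `StratifiedKFold` in place of plain K-Fold (stratification keeps the rare class present in every fold).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic, imbalanced stand-in for the extracted lesion features.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)

# Stratified 5-fold split: each fold keeps roughly the same class ratio.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Hypothetical grid over the regularisation strength C, scored by ROC AUC.
search = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000, class_weight="balanced"),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="roc_auc",
    cv=cv,
)
search.fit(X, y)

best_c = search.best_params_["C"]
best_auc = search.best_score_
print(f"best C = {best_c}, mean cross-validated AUC = {best_auc:.3f}")
```

For larger grids, `RandomizedSearchCV` from the same module samples a fixed number of configurations instead of trying every combination.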



Evaluation

  • Metrics: Accuracy, Precision, Recall, F1-Score, AUC.
  • Validation: Regular monitoring using a validation set.
  • Confusion Matrix: For visualizing classification errors.
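
To make the listed metrics concrete, here is a small pure-Python sketch that derives them from confusion-matrix counts and computes AUC as a pairwise ranking probability. The labels and scores are made-up toy values, and the function names are hypothetical. In practice `sklearn.metrics` provides equivalent functions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = malignant."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 3 malignant and 5 benign cases with model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
auc = roc_auc(y_true, scores)
print(counts, metrics, round(auc, 3))
```

Precision, recall, and F1 depend on the chosen threshold, while AUC summarises ranking quality across all thresholds.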



Tech Stack

Category | Tool/Framework
Platform | Kaggle
Notebook | Jupyter Notebook
Language | Python
Libraries | scikit-learn, XGBoost, LightGBM, matplotlib, seaborn, OpenCV, NumPy, pandas

How to Run

  1. Go to the Kaggle Notebook using the link below:

https://www.kaggle.com/code/masharjavid/final-skin-cancer-binary-classifier

  2. Open the notebook and run all cells.
  3. The dataset is already uploaded in the Kaggle environment and linked within the notebook.

Results

  • Best AUC Score: 0.963
  • Highest Accuracy: 90.02%
  • Top Performing Model: LGBMClassifier (LightGBM)
  • Train Loss: 0.1646
  • Validation Loss: 0.1245
  • Recall: 0.876
  • Time Taken: 60.94 s



Future Work

  • Web/Mobile app deployment with UI for diagnosis.
  • Explore larger datasets for improved generalization.
