This project uses a Support Vector Machine (SVM) with a polynomial kernel to predict whether a website visit led to a purchase. It analyzes session data and uses GridSearchCV to tune the model's hyperparameters for better accuracy.
- SVM Classifier with Polynomial Kernel: Used for classification of visitor sessions.
- Hyperparameter Tuning with GridSearchCV: Optimizes model parameters for better accuracy.
- Data Preprocessing Pipelines: Handles numerical and categorical feature transformations.
- Permutation Importance Analysis: Assesses feature impact on predictions.
- Model Evaluation & Visualization: Includes confusion matrix, learning curves, and feature distributions.
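As an illustration of the preprocessing feature, the following is a minimal sketch of separate numerical and categorical pipelines combined with scikit-learn's `ColumnTransformer`. The column names, the sample rows, and the choice of median imputation, standard scaling, and one-hot encoding are assumptions made for the example; the notebook may use different columns and transformers.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical session data; these column names are illustrative only.
sessions = pd.DataFrame({
    "page_views": [3, 12, 7, 1],
    "session_minutes": [1.5, 22.0, None, 0.4],
    "visitor_type": ["new", "returning", "returning", "new"],
})

numeric_cols = ["page_views", "session_minutes"]
categorical_cols = ["visitor_type"]

# Numerical features: fill missing values, then standardize.
numeric_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical features: one-hot encode, ignoring categories unseen at fit time.
categorical_pipeline = Pipeline([
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# Route each group of columns through its own pipeline.
preprocessor = ColumnTransformer([
    ("num", numeric_pipeline, numeric_cols),
    ("cat", categorical_pipeline, categorical_cols),
])

features = preprocessor.fit_transform(sessions)
print(features.shape)
```

Scaling matters for SVMs in particular, because kernel values depend on feature magnitudes; keeping the transformers inside a pipeline also ensures they are fit only on training folds during cross-validation.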
- ecommerce_notebook.ipynb: Jupyter notebook containing the complete implementation.
- requirements.txt: List of dependencies required to run the project.
- Ensure you have Python installed.
- Install the required dependencies:
pip install -r requirements.txt
- Open and run ecommerce_notebook.ipynb in Jupyter Notebook or execute the script in a Python environment.
- Import Necessary Libraries
- Load and Split the Data: Uses train_test_split for splitting.
- Preprocess Data: Uses pipelines for numerical and categorical features.
- Define Model and Hyperparameter Grid: Sets up SVM classifier with polynomial kernel.
- Train Model with GridSearchCV: Optimizes parameters through cross-validation.
- Evaluate Model on Test Set: Computes accuracy and feature importance.
- Visualizations:
  - Feature distributions (numerical & categorical)
  - Correlation heatmap
  - Confusion matrix
  - Learning curve for training sample sizes
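The split, model definition, tuning, and evaluation steps above can be sketched roughly as follows. This is not the notebook's code: the data is synthetic (generated with `make_classification`), and the parameter grid, the 80/20 split, and the 5-fold cross-validation are assumptions chosen to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the session data (assumption, not the real dataset).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=0
)

# Hold out a stratified test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale features, then classify with a polynomial-kernel SVM.
model = Pipeline([
    ("scale", StandardScaler()),
    ("svc", SVC(kernel="poly")),
])

# Illustrative grid; the values searched in the notebook may differ.
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__degree": [2, 3],
    "svc__coef0": [0, 1],
}

# Cross-validated search over the grid, refit on the full training set.
search = GridSearchCV(model, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

train_accuracy = search.score(X_train, y_train)
test_accuracy = search.score(X_test, y_test)

print("Best parameters:", search.best_params_)
print(f"Train accuracy: {train_accuracy:.3f}")
print(f"Test accuracy:  {test_accuracy:.3f}")
```

Prefixing grid keys with the pipeline step name (`svc__`) lets GridSearchCV tune the classifier while the scaler is refit inside each fold, which avoids leaking validation data into preprocessing.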
- Best hyperparameters selected using GridSearchCV.
- Training and test accuracy scores.
- Feature importance ranking via permutation importance.
- Learning curve visualization showing model performance over different sample sizes.
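For readers who want to see how the feature ranking and learning curve results can be produced, here is a minimal sketch using scikit-learn's `permutation_importance` and `learning_curve`. The synthetic data, the fixed hyperparameters, and the generic `feature_<i>` labels are assumptions for illustration; plotting is omitted and the scores are only printed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.model_selection import learning_curve, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the session data (assumption, not the real dataset).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fixed hyperparameters here; in practice these would come from the grid search.
model = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3, C=1.0))
model.fit(X_train, y_train)

# Permutation importance: drop in test score when one feature is shuffled.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
ranking = np.argsort(result.importances_mean)[::-1]
for idx in ranking:
    mean = result.importances_mean[idx]
    std = result.importances_std[idx]
    print(f"feature_{idx}: {mean:.3f} +/- {std:.3f}")

# Learning curve: cross-validated scores at increasing training-set sizes.
train_sizes, train_scores, val_scores = learning_curve(
    model, X_train, y_train, cv=5,
    train_sizes=np.linspace(0.2, 1.0, 5), scoring="accuracy",
)
for n, tr, va in zip(
    train_sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)
):
    print(f"n={n}: train={tr:.3f}, validation={va:.3f}")
```

Permutation importance is model-agnostic, which makes it a natural fit here: a polynomial-kernel SVM does not expose per-feature coefficients the way a linear model does.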
This project provides a structured approach to classifying website visitor sessions using an SVM classifier with a polynomial kernel. Through hyperparameter tuning and feature importance analysis, it offers insights into the most relevant factors influencing purchase decisions.
Matthew Neba