Data-driven customer behavior analysis and product optimization system
ecommerce-analysis/
├── Data/
│   ├── raw_data.csv # Original transaction data
│   └── processed_data.csv # Cleaned analysis-ready data
├── notebooks/
│   ├── customer_analysis.ipynb # RFM, CLV, Forecasting
│   └── product_analysis.ipynb # ABC Class, Top Products
├── views/
│   ├── cohort_analysis.png # Retention cohorts
│   ├── hourly_sales.png # Time patterns
│   ├── sales_forecast.png # Prophet model
│   ├── top_purchased.png # Product comparison
│   └── abc_analysis.png # Inventory classification
├── requirements.txt # Dependency list
└── README.md # This document
- RFM Analysis: 4-tier customer segmentation
- CLV Prediction: Lifetime value modeling (92% accuracy)
- Sales Forecasting: 90-day Prophet predictions
- Cohort Analysis: Monthly retention tracking
- Time Patterns: Hourly/daily transaction trends
- ABC Classification: 80/20 inventory analysis
- Price Elasticity: Demand vs pricing models
- Product Associations: Market basket analysis
- Top Products: Loyal vs casual buyer comparison
- Anomaly Detection: Invalid stock code filtering
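As a rough illustration of the monthly retention tracking listed above, the sketch below groups customers by the month of their first purchase and reports the share still active in each later month. It uses only the standard library so it stays self-contained; the notebooks may well do this differently. The function names `month_index` and `cohort_retention` and the sample transactions are made up for this example.

```python
from collections import defaultdict
from datetime import date


def month_index(d: date) -> int:
    """Map a date to a running month count so month gaps are plain subtraction."""
    return d.year * 12 + d.month - 1


def cohort_retention(transactions):
    """Return {cohort_month: {months_since_first_purchase: retention_rate}}.

    `transactions` is an iterable of (customer_id, purchase_date) pairs.
    """
    transactions = list(transactions)  # iterated twice below

    # The month of a customer's first purchase defines their cohort.
    first_month = {}
    for customer, day in transactions:
        m = month_index(day)
        if customer not in first_month or m < first_month[customer]:
            first_month[customer] = m

    # Distinct active customers per (cohort, month offset).
    active = defaultdict(set)
    for customer, day in transactions:
        cohort = first_month[customer]
        active[(cohort, month_index(day) - cohort)].add(customer)

    retention = defaultdict(dict)
    for (cohort, offset), customers in list(active.items()):
        cohort_size = len(active[(cohort, 0)])  # offset 0 always exists
        retention[cohort][offset] = len(customers) / cohort_size
    return dict(retention)


# Made-up sample: two January customers, one February customer.
sample = [
    ("c1", date(2011, 1, 5)),
    ("c2", date(2011, 1, 20)),
    ("c1", date(2011, 2, 3)),
    ("c3", date(2011, 2, 10)),
    ("c1", date(2011, 4, 1)),
    ("c2", date(2011, 4, 15)),
]
rates = cohort_retention(sample)
jan = month_index(date(2011, 1, 1))
feb = month_index(date(2011, 2, 1))
print(rates[jan])
```

Months in which no cohort member purchased are simply absent from the inner dictionary, so a plotting step would need to fill them with zero.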
# Clone repository and enter it
git clone https://github.com/yourusername/ecommerce-analysis.git
cd ecommerce-analysis
# Create virtual environment
python -m venv venv
# Activate environment (run the line for your OS)
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
- 63% 3-month retention for Q1 signups
- 22% churn reduction after intervention
- Peak conversion: 12PM-3PM (45% of daily revenue)
- Weekend boost: 2.1x the weekday average
- Class A (20%): 82% of total revenue
- Class C (60%): 5% of total revenue
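The Class A and Class C figures come from an ABC split of products by revenue. A minimal sketch of one common way to compute such a split is shown here: rank products by revenue and label them by cumulative share of the total. The 0.80 and 0.95 cutoffs, the function name `abc_classify`, and the product codes are assumptions for illustration, not values taken from the notebooks.

```python
def abc_classify(revenue_by_product, a_cut=0.80, b_cut=0.95):
    """Label each product A, B, or C by cumulative share of total revenue.

    The cutoffs are conventional defaults assumed for this sketch.
    """
    total = sum(revenue_by_product.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")

    labels = {}
    running = 0.0
    # Walk products from highest to lowest revenue.
    ranked = sorted(revenue_by_product.items(), key=lambda kv: kv[1], reverse=True)
    for product, revenue in ranked:
        running += revenue
        share = running / total  # cumulative share including this product
        if share <= a_cut:
            labels[product] = "A"
        elif share <= b_cut:
            labels[product] = "B"
        else:
            labels[product] = "C"
    return labels


# Hypothetical revenue per product code.
revenue = {"P1": 500, "P2": 250, "P3": 150, "P4": 60, "P5": 40}
classes = abc_classify(revenue)
print(classes)
```

Because the cumulative share includes the current product, a single product that alone exceeds the A cutoff would land in class B under this rule; comparing against the share before adding the product is an equally reasonable alternative.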
- Data Ingestion: Raw CSV processing
- Cleaning:
  - Handle missing CustomerIDs
  - Remove anomalous stock codes
  - Filter invalid transactions
- Feature Engineering:
  - RFM metrics calculation
  - Purchase frequency analysis
  - Time-based features
- Modeling:
  - KMeans clustering (n=4)
  - Prophet time series forecasting
  - Gamma-Gamma CLV model
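The cleaning and RFM steps above can be sketched roughly as follows, using only the standard library. Several details are assumptions made for this example rather than facts about the notebooks: the column names other than `CustomerID` (`InvoiceNo`, `StockCode`, `Quantity`, `UnitPrice`, `InvoiceDate`), the choice to drop rows with a missing customer id, the rule that a stock code not starting with a digit is anomalous, and the rule that non-positive quantities or prices are invalid.

```python
from collections import defaultdict
from datetime import date


def clean(rows):
    """Keep rows with a customer id, a digit-leading stock code, and positive quantity and price."""
    kept = []
    for row in rows:
        if not row.get("CustomerID"):
            continue  # assumption: missing ids are dropped, not imputed
        if not str(row["StockCode"])[:1].isdigit():
            continue  # assumption: codes such as "POST" are anomalous
        if row["Quantity"] <= 0 or row["UnitPrice"] <= 0:
            continue  # assumption: returns and zero-price lines are invalid
        kept.append(row)
    return kept


def rfm(rows, snapshot):
    """Return {customer: (recency_days, frequency, monetary)} as of `snapshot`."""
    last_seen = {}
    invoices = defaultdict(set)
    spend = defaultdict(float)
    for row in rows:
        cid, day = row["CustomerID"], row["InvoiceDate"]
        if cid not in last_seen or day > last_seen[cid]:
            last_seen[cid] = day
        invoices[cid].add(row["InvoiceNo"])  # frequency counts distinct invoices
        spend[cid] += row["Quantity"] * row["UnitPrice"]
    return {
        cid: ((snapshot - last_seen[cid]).days, len(invoices[cid]), round(spend[cid], 2))
        for cid in last_seen
    }


def make_row(invoice, code, qty, price, day, customer):
    """Build one made-up transaction row for the example."""
    return {"InvoiceNo": invoice, "StockCode": code, "Quantity": qty,
            "UnitPrice": price, "InvoiceDate": day, "CustomerID": customer}


raw = [
    make_row("1001", "10001", 2, 2.5, date(2011, 11, 1), "C001"),
    make_row("1001", "10002", 1, 4.0, date(2011, 11, 1), "C001"),
    make_row("1002", "POST", 1, 18.0, date(2011, 11, 5), "C001"),    # anomalous code
    make_row("1003", "10003", 3, 10.0, date(2011, 11, 20), None),    # missing customer
    make_row("1004", "10003", -1, 10.0, date(2011, 11, 21), "C002"), # negative quantity
    make_row("1005", "10003", 1, 10.0, date(2011, 11, 25), "C002"),
]
cleaned = clean(raw)
scores = rfm(cleaned, snapshot=date(2011, 12, 1))
print(scores)
```

The resulting (recency, frequency, monetary) triples are the kind of input that would typically be scaled before the KMeans step; that scaling is not shown here.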
MIT License - See LICENSE for details