1 | 1 | # xbooster 🚀 |
2 | 2 |
3 | | -A scorecard-format famework for logistic regression tasks with gradient-boosted decision trees (XGBoost and CatBoost). |
| 3 | +A scorecard-format framework for logistic regression tasks with gradient-boosted decision trees (XGBoost, LightGBM, and CatBoost). |
4 | 4 | xbooster allows to convert a classification model into a logarithmic (point) scoring system. |
5 | 5 |
6 | 6 | In addition, it provides a suite of interpretability tools to understand the model's behavior. |
@@ -218,6 +218,89 @@ The `DataPreprocessor` provides: |
218 | 218 | 3. Generation of interaction constraints for XGBoost |
219 | 219 | 4. Consistent feature naming for scorecard generation |
220 | 220 |
| 221 | +### LightGBM Support 💡 (Release Candidate) |
| 222 | + |
| 223 | +xbooster provides support for LightGBM models with scorecard functionality. Here's how to use it: |
| 224 | + |
| 225 | +```python |
| 226 | +import pandas as pd |
| 227 | +import lightgbm as lgb |
| 228 | +from xbooster.constructor import LGBScorecardConstructor |
| 229 | +from sklearn.model_selection import train_test_split |
| 230 | +from sklearn.metrics import roc_auc_score |
| 231 | + |
| 232 | +# Load data |
| 233 | +url = "https://github.com/xRiskLab/xBooster/raw/main/examples/data/credit_data.parquet" |
| 234 | +dataset = pd.read_parquet(url) |
| 235 | + |
| 236 | +features = [ |
| 237 | + "external_risk_estimate", |
| 238 | + "revolving_utilization_of_unsecured_lines", |
| 239 | + "account_never_delinq_percent", |
| 240 | + "net_fraction_revolving_burden", |
| 241 | + "num_total_cc_accounts", |
| 242 | + "average_months_in_file", |
| 243 | +] |
| 244 | + |
| 245 | +target = "is_bad" |
| 246 | +X, y = dataset[features], dataset[target] |
| 247 | + |
| 248 | +X_train, X_test, y_train, y_test = train_test_split( |
| 249 | + X, y, test_size=0.3, random_state=62, stratify=y |
| 250 | +) |
| 251 | + |
| 252 | +# Train LightGBM model |
| 253 | +model = lgb.LGBMClassifier( |
| 254 | + n_estimators=50, |
| 255 | + learning_rate=0.55, |
| 256 | + max_depth=1, |
| 257 | + num_leaves=2, |
| 258 | + min_child_samples=10, |
| 259 | + random_state=62, |
| 260 | + verbose=-1, |
| 261 | +) |
| 262 | +model.fit(X_train, y_train) |
| 263 | + |
| 264 | +# Initialize LGBScorecardConstructor |
| 265 | +constructor = LGBScorecardConstructor(model, X_train, y_train) |
| 266 | + |
| 267 | +# Construct scorecard |
| 268 | +scorecard = constructor.construct_scorecard() |
| 269 | +print(scorecard.head()) |
| 270 | + |
| 271 | +# Create points with base score normalization (default) |
| 272 | +scorecard_with_points = constructor.create_points( |
| 273 | + pdo=50, |
| 274 | + target_points=600, |
| 275 | + target_odds=19, |
| 276 | + precision_points=0, |
| 277 | + use_base_score=True # Ensures proper tree contribution balancing |
| 278 | +) |
| 279 | + |
| 280 | +# Make predictions |
| 281 | +credit_scores = constructor.predict_score(X_test) |
| 282 | + |
| 283 | +# Calculate Gini |
| 284 | +gini = roc_auc_score(y_test, -credit_scores) * 2 - 1 |
| 285 | +print(f"Scorecard Gini: {gini:.4f}") |
| 286 | + |
| 287 | +# Compare with model predictions |
| 288 | +model_gini = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]) * 2 - 1 |
| 289 | +print(f"Model Gini: {model_gini:.4f}") |
| 290 | +``` |
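The Gini lines in the example above rely on the identity Gini = 2 × AUC − 1, and they negate `credit_scores` because a higher credit score is meant to indicate lower risk while `is_bad = 1` marks the risky class. The following is a minimal, standard-library sketch of that relationship on made-up toy data. The helpers `auc` and `gini` are hypothetical illustrations and are not part of xbooster or scikit-learn.

```python
def auc(labels, scores):
    """Pairwise AUC: probability that a random positive outranks a random negative.

    Ties count as half a win. O(n_pos * n_neg), fine for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient derived from AUC."""
    return 2 * auc(labels, scores) - 1


# Toy data (invented for this sketch): 1 = bad, 0 = good.
labels = [1, 0, 1, 0, 0]
credit_scores = [420, 610, 480, 650, 450]

# Higher score = lower risk, so negate the scores to rank "bad" cases highest.
gini_negated = gini(labels, [-s for s in credit_scores])
# Without the negation the sign of the Gini flips.
gini_raw = gini(labels, credit_scores)
print(f"Gini with negated scores: {gini_negated:.4f}")
print(f"Gini with raw scores:     {gini_raw:.4f}")
```

On real data, `sklearn.metrics.roc_auc_score` as used in the example is the practical choice. The pairwise version here only makes the sign convention visible.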
| 291 | + |
| 292 | +**Key Features:** |
| 293 | +- **Scorecard Construction**: Implementation of `create_points()` and `predict_score()` |
| 294 | +- **Base Score Normalization**: Proper handling of LightGBM's base score for balanced tree contributions |
| 295 | +- **High Discrimination**: Scorecard Gini closely matches model Gini |
| 296 | +- **Flexible**: `use_base_score` parameter for optional base score normalization |
| 297 | + |
| 298 | +**Important Notes:** |
| 299 | +- **Release Candidate**: This feature is in testing phase - feedback welcome! |
| 300 | +- LightGBM's sklearn API handles base_score differently than XGBoost |
| 301 | +- The `use_base_score=True` parameter (default) ensures proper normalization |
| 302 | +- Only `XAddEvidence` score type is supported (WOE not applicable) |
| 303 | + |
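For readers unfamiliar with the `pdo`, `target_points`, and `target_odds` arguments passed to `create_points()` above, the sketch below shows the conventional points-to-double-the-odds scaling used in credit scorecards. It assumes the textbook convention (good:bad odds, natural logarithm). Whether xbooster applies exactly this formula internally, and how it distributes points across tree leaves, is not stated in this diff, so treat it as an illustration only. The functions `scaling_constants` and `points_from_odds` are hypothetical names, not xbooster APIs.

```python
import math


def scaling_constants(pdo, target_points, target_odds):
    """Return (factor, offset) for the standard PDO scaling.

    factor: points added per unit of log-odds, chosen so that doubling
            the odds adds exactly `pdo` points.
    offset: chosen so that `target_odds` maps to `target_points`.
    """
    factor = pdo / math.log(2)
    offset = target_points - factor * math.log(target_odds)
    return factor, offset


def points_from_odds(odds, pdo=50, target_points=600, target_odds=19):
    """Map good:bad odds to a score under the assumed convention."""
    factor, offset = scaling_constants(pdo, target_points, target_odds)
    return offset + factor * math.log(odds)


# With the settings from the example: odds of 19:1 sit at 600 points,
# and each doubling of the odds adds 50 points.
for odds in (9.5, 19, 38):
    print(f"odds {odds:>4}:1 -> {points_from_odds(odds):.1f} points")
```

Under this convention, `precision_points=0` would correspond to rounding the resulting points to whole numbers.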
221 | 304 | ### CatBoost Support 🐱 (Beta) |
222 | 305 |
223 | 306 | xbooster provides experimental support for CatBoost models with reduced functionality compared to XGBoost. Here's how to use it: |