Check how explain_weights works on regression problems #175

Open
lopuhin opened this issue Apr 27, 2017 · 3 comments

lopuhin (Contributor) commented Apr 27, 2017

More of a note to self - I'll either expand it into something more reproducible or close it:

  • explain_weights for the xgboost regressor does not have a bias - is it possible to recover it, or some other indication of how good the features are vs. just predicting the mean? Maybe this applies to classification problems as well? (A baseline-comparison sketch is below.)
  • explain_weights for Lasso looks suspicious on the diabetes + leak dataset - the leak does not stand out among the other features, but it does for xgboost (need to check the model first; a rough reproduction sketch is below).
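For the first bullet, one way to quantify "features vs. just predicting the mean" outside of explain_weights is to compare the model against a constant baseline; a minimal sketch, where dataset and model settings are placeholders rather than anything from this issue:

```python
from sklearn.datasets import load_diabetes
from sklearn.dummy import DummyRegressor
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor

X, y = load_diabetes(return_X_y=True)

# Cross-validated MSE of always predicting the training mean vs. the boosted model.
baseline = cross_val_score(DummyRegressor(strategy='mean'), X, y,
                           scoring='neg_mean_squared_error').mean()
boosted = cross_val_score(XGBRegressor(n_estimators=100), X, y,
                          scoring='neg_mean_squared_error').mean()
print('predict-the-mean MSE:', -baseline)
print('xgboost MSE:         ', -boosted)
```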
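And a minimal reproduction sketch for the second bullet, assuming the leak column is the target plus a little noise (the exact construction is not given in the issue):

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from xgboost import XGBRegressor
import eli5

data = load_diabetes()
X, y = data.data, data.target
rng = np.random.RandomState(42)

# Assumed "leak" feature: the target plus a bit of noise.
leak = (y + rng.normal(scale=0.1 * y.std(), size=y.shape)).reshape(-1, 1)
X_leak = np.hstack([X, leak])
feature_names = list(data.feature_names) + ['leak']

lasso = Lasso(alpha=0.1).fit(X_leak, y)
xgb = XGBRegressor(n_estimators=100).fit(X_leak, y)

# Compare how prominently 'leak' shows up in each explanation.
print(eli5.format_as_text(eli5.explain_weights(lasso, feature_names=feature_names)))
print(eli5.format_as_text(eli5.explain_weights(xgb, feature_names=feature_names)))
```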
kmike (Contributor) commented May 3, 2017

Hm, I'm not sure bias makes sense for explain_weights + an xgboost regressor; the regressor predicts values regardless of the mean, and GBMs can handle shifts in the data without any special handling, so there is no need to account for bias explicitly. I haven't seen feature importances for "bias" in decision trees or ensembles.

But maybe I'm wrong and it is possible to introduce some notion of bias which makes sense. For example, in LightGBM the first iteration is a synthetic tree which always predicts the bias; while not required, as I understand it, this helps with convergence in practice. So maybe the way to look at it is to compare the first tree with the next trees in the ensemble, or maybe to check several of the low-iteration trees; this is a more general approach which is not specific to bias. I haven't tried it, but it may work (a rough sketch is below).
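A rough sketch of that comparison (not part of eli5; dataset and settings are placeholders), truncating prediction at the first iteration via LightGBM's num_iteration argument:

```python
from sklearn.datasets import load_diabetes
from lightgbm import LGBMRegressor

X, y = load_diabetes(return_X_y=True)
model = LGBMRegressor(n_estimators=100).fit(X, y)

pred_first = model.predict(X, num_iteration=1)  # first iteration only
pred_full = model.predict(X)                    # whole ensemble

# If the first iteration mostly predicts a constant close to the target mean,
# its spread will be small compared to the full ensemble's predictions.
print('target mean:             ', y.mean())
print('first-iteration mean/std:', pred_first.mean(), pred_first.std())
print('full-ensemble std:       ', pred_full.std())
```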

alzmcr commented Mar 5, 2019

@lopuhin regarding the BIAS, I wanted to comment here but then noticed the issue is closed.
Reading the blog post you've linked in that comment, my understanding is that BIAS is the starting point, and every shift from it is how feature importance is defined for each prediction.

If BIAS ends up being the most "relevant" feature in the explanation, doesn't that mean that the shift from the mean along the path taken is minimal and none of the features play a key role? In other words, no feature is a discriminator for this prediction - it's just the expected value?
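For reference, a small sketch (dataset and model are placeholders) of the per-prediction view where the BIAS term shows up for an xgboost regressor:

```python
from sklearn.datasets import load_diabetes
from xgboost import XGBRegressor
import eli5

data = load_diabetes()
model = XGBRegressor(n_estimators=100).fit(data.data, data.target)

# explain_prediction for xgboost lists a <BIAS> term next to the feature
# contributions; if it dominates and the feature contributions are small,
# the prediction is close to the expected value, as described above.
print(eli5.format_as_text(
    eli5.explain_prediction(model, data.data[0],
                            feature_names=list(data.feature_names))))
```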

lopuhin (Contributor, Author) commented Mar 6, 2019

@alzmcr yes, I think your interpretation is fair 👍
