
allow to pass multiple documents to explain_prediction #225

Open
kmike opened this issue Jun 23, 2017 · 1 comment
@kmike
Contributor

kmike commented Jun 23, 2017

Currently we allow passing only a single document to explain_prediction. I think we should allow passing multiple documents as well.

There are two ways to handle multiple documents; both of them look useful:

  1. Just show explanations for all instances - similar to what is discussed in "show_prediction() doesn't work inside a loop" #205.

  2. Aggregate the resulting explanations in some way. For example, https://github.com/andosa/treeinterpreter has this feature - you can check which features are important for predicting labels on a subset of the dataset. I'm not sure we should show all individual documents in this case; probably there should be an option to turn that off (or maybe there is one already - can we turn it off using the show argument?).
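A minimal sketch of option (2), assuming per-document explanations have already been reduced to `{feature: weight}` dicts. The function and data here are hypothetical illustrations, not existing eli5 API:

```python
from collections import defaultdict

def aggregate_explanations(per_doc_weights):
    """Average feature weights across documents.

    per_doc_weights: list of {feature: weight} dicts, one per document
    (a hypothetical intermediate format, not the eli5 Explanation object).
    A feature absent from a document contributes zero to its average.
    """
    totals = defaultdict(float)
    for weights in per_doc_weights:
        for feature, weight in weights.items():
            totals[feature] += weight
    n = len(per_doc_weights)
    return {feature: total / n for feature, total in totals.items()}

docs = [
    {"good": 1.5, "bad": -0.5},
    {"good": 0.5, "price": 1.0},
]
print(aggregate_explanations(docs))
# {'good': 1.0, 'bad': -0.25, 'price': 0.5}
```

Whether to divide by the total number of documents (as above) or only by the number of documents where a feature actually appears is one of the design questions an aggregation API would have to settle.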

It may also help with #213 (comment).

@lopuhin
Contributor

lopuhin commented Aug 11, 2017

I wrote a PoC for (2) outside of eli5, targeting only XGBoost. It worked much faster than calling explain_prediction multiple times and then aggregating the results. The only caveat is that I needed special handling of missing values: xgboost might use some positive feature, and if that feature is missing, it will lower the score. The weight for this missing positive feature would then be negative, so I excluded such weights from aggregation, and after that the feature weights started to make sense. But this handling might not always be desired. This approach also requires custom code in each classifier we want to support, although it's not a huge change, and one document can be seen as a special case of multiple documents here.
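The missing-value handling described above could be sketched roughly like this: before aggregating, drop the weight of any feature that was missing in a given document. All names here are illustrative, not eli5 or xgboost API:

```python
def filter_missing(feature_weights, feature_values, missing=None):
    """Drop weights for features whose value is missing in a document.

    As noted above, XGBoost can assign a negative weight to a positive
    feature that is absent from a document; excluding those weights keeps
    the aggregated importances meaningful. Hypothetical sketch only.
    """
    return {
        feature: weight
        for feature, weight in feature_weights.items()
        if feature_values.get(feature, missing) is not missing
    }

weights = {"good": 0.8, "price": -0.3}
values = {"good": 1.0}  # "price" is missing in this document
print(filter_missing(weights, values))
# {'good': 0.8}
```

Making the exclusion optional (e.g. via a flag) would address the point that this handling is not always desired.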
