Description
When using the TabPFN explainer with parallel computation (n_jobs > 1), execution fails due to thread-unsafe dictionary operations in sampling.py. This makes the parallel computation feature described in the documentation unusable for TabPFN models.
The issue occurs in the sampling.py implementation when computing Shapley values for multiple instances in parallel. The documentation suggests that parallel computation is supported through joblib (as shown in the parallel computation tutorial), but this does not work with TabPFN models because of thread safety issues.
Steps to Reproduce
Install required packages:
pip install shapiq tabpfn
Initialize TabPFN model and explainer:
from tabpfn.scripts.transformer_prediction_interface import TabPFNModel
from shapiq import Explainer
import numpy as np

# Prepare data
X_train = np.random.rand(100, 10)
X_test = np.random.rand(20, 10)
y_train = np.random.randint(0, 2, 100)

# Initialize model and explainer
tabpfn = TabPFNModel(device='cpu', N_ensemble_configurations=3)
explainer = Explainer(model=tabpfn, data=X_train)
Attempt to compute Shapley values in parallel:
# This will fail
explanations = explainer.explain_X(X_test[:20], n_jobs=4)
Error Message
RuntimeError: dictionary changed size during iteration
The above exception was the direct cause of the following exception:
...
File "...shapiq/approximator/sampling.py", line 491, in execute_border_trick
for coalition in sampled_coalitions_dict:
RuntimeError: dictionary changed size during iteration
Detailed Analysis
Root Cause:
The error occurs in the execute_border_trick method in sampling.py
The implementation attempts to modify a dictionary while iterating over it
This operation is not thread-safe and fails in parallel execution
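The failure mode named in the root cause can be shown in isolation. The sketch below is not the shapiq code: the function names and the tuple-keyed dictionary are hypothetical, chosen only to show how inserting keys into a dictionary while iterating over it raises the same RuntimeError, and how iterating over a snapshot of the keys avoids it.

```python
def grow_while_iterating(counts: dict) -> None:
    """Buggy pattern: inserts new keys while iterating over the same dict."""
    for key in counts:
        # Adding a key changes the dict size, so the next iteration step raises
        # "RuntimeError: dictionary changed size during iteration".
        counts[key + ("extra",)] = 0


def grow_from_snapshot(counts: dict) -> dict:
    """Safer pattern: iterate over a snapshot of the keys, then mutate."""
    for key in list(counts):  # list(...) copies the keys before the loop starts
        counts[key + ("extra",)] = 0
    return counts


if __name__ == "__main__":
    try:
        grow_while_iterating({(0,): 1})
    except RuntimeError as err:
        print("unsafe version failed:", err)
    print(grow_from_snapshot({(0,): 1, (1,): 2}))
```

A snapshot only protects against mutation by the loop itself. If several threads share one dictionary, a lock or a private copy per worker would still be needed.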
Current Limitations:
Parallel computation (n_jobs > 1) cannot be used with TabPFN explainer
The only working solution is to use n_jobs=1
This significantly impacts performance when explaining multiple instances
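Besides n_jobs=1, one generic way to avoid shared mutable state is to give every task its own copy of the object that holds it. The sketch below illustrates that pattern with the standard library only. ToyExplainer and explain_all are hypothetical stand-ins, not shapiq or TabPFN APIs, and whether deep-copying a real TabPFN explainer is feasible or affordable is an untested assumption.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


class ToyExplainer:
    """Hypothetical stand-in for an explainer that keeps mutable per-call state."""

    def __init__(self):
        self.sampled = {}  # mutable state that must not be shared across threads

    def explain(self, row):
        # Fill the per-call state, then read it back.
        for i, value in enumerate(row):
            self.sampled[(i,)] = value
        return sum(self.sampled.values())


def explain_all(explainer, rows, n_workers=4):
    """Explain each row on a private deep copy of the explainer."""

    def _one(row):
        # Each task mutates only its own copy; the original is never written to.
        return copy.deepcopy(explainer).explain(row)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(_one, rows))  # map preserves input order


if __name__ == "__main__":
    print(explain_all(ToyExplainer(), [[1, 2], [3, 4, 5]]))
```

The trade-off is memory and copy time per task, which may be significant for a large model.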
Documentation Gap:
The parallel computation tutorial suggests this feature works for all models
There is no mention of TabPFN-specific limitations
Users might waste time trying to debug parallel computation issues
Impact
Performance degradation due to forced sequential computation
Inconsistency between documentation and actual functionality
Poor user experience when trying to use parallel features with TabPFN
Environment
shapiq version: 1.2.0
Python version: 3.8
Operating System: Windows 10
TabPFN version: latest