diff --git a/_images/2a2607fbc74903709fb433c13fa4f44a864c0067336c34ef3ff01e4ddccbfd7b.png b/_images/2a2607fbc74903709fb433c13fa4f44a864c0067336c34ef3ff01e4ddccbfd7b.png
deleted file mode 100644
index 6f727f9..0000000
Binary files a/_images/2a2607fbc74903709fb433c13fa4f44a864c0067336c34ef3ff01e4ddccbfd7b.png and /dev/null differ
diff --git a/_images/5dd4146ebf31fbe53c566ab5a091ff1d1f709d050d2aefc911ebec8a74b404e3.png b/_images/5dd4146ebf31fbe53c566ab5a091ff1d1f709d050d2aefc911ebec8a74b404e3.png
new file mode 100644
index 0000000..8b041c2
Binary files /dev/null and b/_images/5dd4146ebf31fbe53c566ab5a091ff1d1f709d050d2aefc911ebec8a74b404e3.png differ
diff --git a/_images/986d3355a04a331a4f82450cdb3d395e2e4c3f0e6bc69ce6ee11d573526c9fb5.png b/_images/986d3355a04a331a4f82450cdb3d395e2e4c3f0e6bc69ce6ee11d573526c9fb5.png
new file mode 100644
index 0000000..63eedb5
Binary files /dev/null and b/_images/986d3355a04a331a4f82450cdb3d395e2e4c3f0e6bc69ce6ee11d573526c9fb5.png differ
diff --git a/_images/aecbb96219de7dc8c19cf2a447045bd58e83dd8a764f2eeda59c9c191f8d47b8.png b/_images/aecbb96219de7dc8c19cf2a447045bd58e83dd8a764f2eeda59c9c191f8d47b8.png
deleted file mode 100644
index 8497fbc..0000000
Binary files a/_images/aecbb96219de7dc8c19cf2a447045bd58e83dd8a764f2eeda59c9c191f8d47b8.png and /dev/null differ
diff --git a/_sources/notebooks/5-interpretability.ipynb b/_sources/notebooks/5-interpretability.ipynb
index 18d2750..5ad92c5 100644
--- a/_sources/notebooks/5-interpretability.ipynb
+++ b/_sources/notebooks/5-interpretability.ipynb
@@ -383,13 +383,11 @@
     "\n",
     "We have the 3 features and how varying these changes the impact in predicting a specific class.\n",
     "\n",
-    "Interestingly, we can see that the Culmen length for [A] is smaller, because larger values reduce the partial dependence , [B] however seems to have a larger Culmen length and [C] is almost unaffected by this feature!\n",
+    "Interestingly, we can see that the Culmen length for Adelie is smaller, because larger values reduce the partial dependence; Chinstrap penguins, however, seem to have a larger Culmen length, and Gentoo is almost unaffected by this feature!\n",
     "\n",
-    "Similarly only [C] seems to have larger Flippers, whereas smaller flippers have a lower partial dependence for large values.\n",
+    "Similarly, only Gentoo seems to have larger Flippers, whereas the other species show a lower partial dependence at large flipper values.\n",
     "\n",
-    "I'm not a penguin expert, I just find them adorable, and I'm able to glean this interpretable information from the plots.\n",
-    "\n",
-    "I think is a great tool!"
+    "I'm not a penguin expert, I just find them adorable, and I'm able to glean this interpretable information from the plots. I think it's a great tool! 🐧"
    ]
   },
   {
@@ -397,7 +395,21 @@
    "metadata": {},
    "source": [
     "### Feature importances with Tree importance vs Permutation importance\n",
-    "\n"
+    "\n",
+    "Understanding feature importance is crucial in machine learning, as it helps us identify which features have the most significant impact on model predictions.\n",
+    "\n",
+    "Two standard methods for assessing feature importance are Tree Importance and Permutation Importance.\n",
+    "Tree Importance, usually associated with tree-based models like random forests, calculates feature importances based on how frequently a feature is used to split nodes in the trees and how much those splits reduce impurity. It's essentially a counting exercise.\n",
+    "\n",
+    "Features frequently selected for splitting are considered more important because they contribute more to the model's predictive performance. One benefit of Tree Importance is its computational efficiency, as the importances are obtained as a by-product of training. However, Tree Importance may overestimate the importance of correlated features and of high-cardinality or noisy features, and it struggles to capture feature interactions.\n",
+    "\n",
+    "On the other hand, Permutation Importance assesses feature importance by measuring the decrease in model performance when the values of a feature are randomly shuffled. Features that, when shuffled, lead to a significant decrease in model performance are deemed more important. Permutation Importance is model-agnostic and can be applied to any type of model, making it versatile and applicable in various scenarios. Additionally, Permutation Importance accounts for feature interactions and is less biased by correlated features. However, it is computationally more expensive, especially for models with large numbers of features or complex interactions.\n",
+    "\n",
+    "People are interested in feature importances for several reasons. Firstly, feature importances provide insights into the underlying relationships between features and the target variable, aiding in feature selection and dimensionality reduction.\n",
+    "\n",
+    "Moreover, understanding feature importances helps researchers and practitioners interpret model predictions and identify potential areas for improvement or further investigation. Feature importances can also inform domain experts and stakeholders about which features are driving model decisions, enhancing transparency and trust in machine learning systems.\n",
+    "\n",
+    "We'll start out by training a different type of model in this section, a standard Random Forest. Then we can directly compare the tree-based feature importance with permutation importances. The data split from [the Data notebook](/notebooks/0-basic-data-prep-and-model.html) we established earlier remains the same, and the pre-processing is also the same, even though Random Forests handle non-normalised data well."
    ]
   },
   {
@@ -437,13 +449,22 @@
     "\n",
     "rf = Pipeline(steps=[\n",
     "    ('preprocessor', preprocessor),\n",
-    "    ('classifier', RandomForestClassifier()),\n",
+    "    ('classifier', RandomForestClassifier(random_state=42)),\n",
     "])\n",
     "\n",
     "rf.fit(X_train, y_train)\n",
     "rf.score(X_test, y_test)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we can simply plot the feature importances obtained from training the model.\n",
+    "\n",
+    "These will always be slightly different between runs, because Random Forests are trained on randomly selected subsets of the data."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 10,
@@ -473,6 +494,11 @@
     "plt.show()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": []
+  },
   {
    "cell_type": "code",
    "execution_count": 11,
diff --git a/notebooks/0-basic-data-prep-and-model.html b/notebooks/0-basic-data-prep-and-model.html
index f7b85a4..b69f323 100644
--- a/notebooks/0-basic-data-prep-and-model.html
+++ b/notebooks/0-basic-data-prep-and-model.html
@@ -984,39 +984,39 @@
-0.9914163090128756
+0.9871244635193133
-1.0
+0.9900990099009901
-{'fit_time': array([0.00643802, 0.0052402 , 0.00526881, 0.00526285, 0.0052402 ]),
- 'score_time': array([0.00420523, 0.0040257 , 0.00399923, 0.00399208, 0.0042634 ]),
+{'fit_time': array([0.00581956, 0.00523829, 0.00519466, 0.00517082, 0.00550675]),
+ 'score_time': array([0.00410533, 0.00397706, 0.00397515, 0.00400424, 0.00418091]),
'test_MCC': array([0.37796447, 0.27863911, 0.40824829, 0.02424643, 0.08625819]),
'test_ACC': array([0.73333333, 0.7 , 0.76666667, 0.66666667, 0.62068966])}
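A dict with these keys is what scikit-learn's `cross_validate` returns when given named scorers; a minimal sketch for context (not the notebook's exact cell; `model`, `X_train`, and `y_train` are placeholders for the notebook's pipeline and training split):

```python
# Hedged sketch: reproduce a results dict with fit_time/score_time/test_MCC/test_ACC keys.
from sklearn.model_selection import cross_validate
from sklearn.metrics import accuracy_score, make_scorer, matthews_corrcoef

scores = cross_validate(
    model, X_train, y_train, cv=5,
    scoring={"MCC": make_scorer(matthews_corrcoef), "ACC": make_scorer(accuracy_score)},
)
print(scores["test_MCC"], scores["test_ACC"])  # per-fold scores, as in the output above
```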
diff --git a/notebooks/5-interpretability.html b/notebooks/5-interpretability.html
index 10005a0..7c5f853 100644
--- a/notebooks/5-interpretability.html
+++ b/notebooks/5-interpretability.html
@@ -612,67 +612,61 @@ 5.3.1. Partial Dependence for Machine Le
dict_keys(['grid_values', 'values', 'average'])
-Example Values: [37. 37.7 39.3 40.1 40.6 41.1 41.7 44.1 45.2 45.5 45.6 45.7 45.8 47.6
- 49. 50.2 50.5 51.3 52.7], Average: [[0.63714511 0.5385696 0.42161577 0.39381485 0.36065106 0.3238969
- 0.26682915 0.20091536 0.19660165 0.18819461 0.15634479 0.17994034
- 0.17925519 0.17867114]
- [0.66318481 0.59388464 0.47709496 0.42037882 0.39250963 0.3267896
- 0.32231986 0.20354233 0.19911826 0.19050522 0.15695345 0.17994863
- 0.17914305 0.17824962]
- [0.66562535 0.62561145 0.4800818 0.45236882 0.4192684 0.32990531
- 0.32540806 0.23015716 0.20192886 0.19308603 0.21046148 0.20383814
- 0.20289978 0.20166824]
- [0.84718512 0.63752568 0.62712499 0.59446692 0.5669389 0.44405449
- 0.41597796 0.34951325 0.34475954 0.28255543 0.21691291 0.20687517
- 0.20524067 0.20228448]
- [0.93972872 0.69121058 0.65761143 0.6540416 0.65035044 0.53323133
- 0.47636721 0.35760839 0.35286468 0.31948567 0.22185888 0.20994723
- 0.20782948 0.20370368]
- [1.00059758 0.72339829 0.66107451 0.65759595 0.65397513 0.56078326
- 0.53294581 0.41441573 0.35709513 0.34739729 0.2247428 0.21189619
- 0.20953764 0.2047238 ]
- [1.00366195 0.72670267 0.66457261 0.66117499 0.65764569 0.59361288
- 0.53692337 0.4186242 0.39032028 0.38072077 0.22792981 0.21411943
- 0.21150941 0.20596682]
- [1.50900785 1.47183047 1.41302442 1.41074092 1.35570081 1.24235242
- 1.23937357 1.17689039 1.14954414 1.11252866 0.74548389 0.54670843
- 0.51222928 0.38544081]
- [1.62434096 1.61664115 1.61046548 1.60819089 1.60582263 1.59809884
- 1.59528701 1.50455142 1.47758459 1.38900603 1.00265443 0.73356448
- 0.69871695 0.65742776]
- [1.62980239 1.62237432 1.61615296 1.61384879 1.61143257 1.6036396
- 1.6008404 1.53904675 1.53574034 1.47614674 1.08085006 0.79284749
- 0.73418057 0.68750833]
- [1.63943184 1.63287989 1.6266261 1.62424331 1.62175122 1.61364826
- 1.61076178 1.6015663 1.59828153 1.56759501 1.11613925 1.03936766
- 0.93827262 0.70424585]
- [1.6435549 1.63750425 1.63138763 1.62898639 1.62644408 1.61817501
- 1.61521615 1.60585824 1.60255746 1.59559347 1.14975076 1.04484802
- 1.03839281 0.76193802]
- [1.64545337 1.63960981 1.63362627 1.63122595 1.62866951 1.62030598
- 1.6173228 1.60785976 1.60454014 1.59757005 1.17553019 1.0763064
- 1.01722941 0.8800954 ]
- [1.64724805 1.64158374 1.63574989 1.63336845 1.63080655 1.62235007
- 1.61933771 1.60976925 1.60642479 1.59943461 1.17748052 1.07864503
- 1.04856445 0.90604118]
- [1.64894198 1.64343268 1.63773688 1.63539748 1.63284621 1.62430887
- 1.62126213 1.61158878 1.60821436 1.60119202 1.20824135 1.08080384
- 1.05080481 1.00293375]
- [1.68238854 1.64828327 1.64288393 1.64067212 1.63823839 1.62966681
- 1.62650424 1.61651791 1.61303065 1.60585638 1.2892403 1.11521733
- 1.10921478 0.98507371]
- [1.68491542 1.65097705 1.64569885 1.64354239 1.64117543 1.63275318
- 1.62954734 1.61934748 1.61578909 1.60848271 1.34441304 1.14180502
- 1.1359731 1.04101396]
- [1.68603796 1.65217043 1.64693599 1.64479968 1.64245743 1.63411697
- 1.63092122 1.62061868 1.6170318 1.6096552 1.34552126 1.17200642
- 1.13729958 1.01354773]
- [1.68877221 1.68410174 1.6500044 1.64790662 1.64561287 1.63748311
- 1.63433761 1.62386218 1.62018391 1.61261818 1.37718115 1.25136991
- 1.19842054 1.04626389]
- [1.71482178 1.68753488 1.68264896 1.68058937 1.64939092 1.64149842
- 1.63847645 1.62818187 1.6243867 1.61647898 1.38036652 1.31300013
- 1.22616058 1.18594288]]
+Example Values: [36.2 37. 37.3 37.7 38.8 40.8 41.1 42.1 42.3 43.2 43.5 45.6 46.2 46.7
+ 46.8 49.1 50. 50.5 50.7 51.5], Average: [[0.90177069 0.78671704 0.70469108 0.56801047 0.56387708 0.55967227
+ 0.55107687 0.52167929 0.37392551 0.33493436 0.29741053 0.29468135
+ 0.29041098 0.28760599 0.28612207 0.26249271]
+ [0.95854922 0.84406229 0.78728123 0.67601119 0.64692182 0.6176926
+ 0.60894839 0.60450488 0.45647307 0.34221175 0.30205558 0.29872462
+ 0.29333685 0.28960929 0.28738673 0.28743635]
+ [1.01551367 0.92684212 0.87024088 0.75932672 0.73040366 0.72631957
+ 0.64263557 0.63808699 0.53973778 0.40017748 0.30797264 0.30409229
+ 0.29746883 0.29261721 0.28948639 0.28796838]
+ [1.21543358 1.15454241 1.04923128 0.99011567 0.98671552 0.98310429
+ 0.87532583 0.87120056 0.64807017 0.60751218 0.33510354 0.33014945
+ 0.3205865 0.31189322 0.30470411 0.29497534]
+ [1.24402018 1.18345138 1.07825509 0.99439123 0.99110542 0.98762567
+ 0.90504265 0.87597612 0.65319315 0.63762267 0.33948467 0.3343985
+ 0.32463604 0.31556777 0.30788518 0.29670303]
+ [1.46518415 1.25675279 1.22731435 1.21935174 1.1913854 1.11326505
+ 1.03165386 1.00314301 0.88281229 0.76848444 0.49282031 0.43715282
+ 0.35095601 0.34009502 0.33025273 0.31129916]
+ [1.54706175 1.33922813 1.31009225 1.22762444 1.22474917 1.22169955
+ 1.09012529 1.06164286 0.96704899 0.82799127 0.5776392 0.5217532
+ 0.41022477 0.34899575 0.338429 0.31750116]
+ [1.56949432 1.56359638 1.56033575 1.47925381 1.40187371 1.34932349
+ 1.29368981 1.26557547 1.17175739 1.0084355 0.7110423 0.6549509
+ 0.64240401 0.58020699 0.46824794 0.34180036]
+ [1.5723779 1.56667319 1.56353138 1.50770101 1.48040632 1.40291521
+ 1.32237702 1.29433087 1.17580618 1.03746466 0.74033273 0.70937585
+ 0.64687068 0.60948593 0.52245194 0.34540078]
+ [1.61327044 1.60801497 1.60519383 1.60013854 1.59820991 1.5961448
+ 1.59155638 1.5640031 1.29780848 1.23512441 0.96313622 0.93238057
+ 0.8207369 0.70862276 0.64601296 0.44171237]
+ [1.61572641 1.61045503 1.60763686 1.60261566 1.60070985 1.59867496
+ 1.59417395 1.5916806 1.37600032 1.23848599 0.99154717 0.93578386
+ 0.84914562 0.73716793 0.67464783 0.4950705 ]
+ [1.73328051 1.65290729 1.64986218 1.64448946 1.64252114 1.6404623
+ 1.61105659 1.60869728 1.56992278 1.50949592 1.21442115 1.15868685
+ 1.04702336 0.96022004 0.9234437 0.64376455]
+ [1.73519938 1.67982643 1.6517578 1.64630121 1.64430186 1.64221505
+ 1.63776544 1.61039161 1.57164816 1.51135617 1.21671527 1.21098726
+ 1.07432268 0.98753812 0.9507733 0.67126142]
+ [1.7637881 1.73347284 1.68035391 1.64975343 1.64768095 1.64552627
+ 1.64096246 1.638545 1.57477148 1.5396688 1.22093739 1.21519319
+ 1.12855512 1.01680176 0.93005867 0.75092701]
+ [1.76995852 1.73979293 1.73663613 1.70588806 1.67869752 1.65139213
+ 1.64651792 1.64396753 1.57984487 1.54487162 1.30285845 1.22211976
+ 1.18553287 1.14884767 1.03720694 0.88385137]
+ [1.77368698 1.74365561 1.74045745 1.7096149 1.70740263 1.68005246
+ 1.64995792 1.64728185 1.58272349 1.54769294 1.35665257 1.32599698
+ 1.21445766 1.15288791 1.14136026 0.88856036]
+ [1.77647515 1.77170323 1.74349877 1.71253491 1.71028273 1.70790049
+ 1.67772755 1.64993363 1.58485951 1.54968728 1.43434804 1.40375491
+ 1.2673429 1.20591059 1.14452556 0.8923069 ]
+ [1.79945725 1.77243788 1.76985703 1.7392254 1.73695508 1.70949457
+ 1.70407438 1.70116209 1.58464611 1.5485844 1.4341855 1.42907144
+ 1.41867805 1.38335043 1.32327375 1.05009869]]
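For context, output like the above (a keys view plus grid values and per-class averages) is the shape of what scikit-learn's `partial_dependence` returns; a minimal sketch, assuming a fitted classifier `model`, a DataFrame `X_train`, and a penguin feature name, none of which are confirmed by the diff itself:

```python
# Hedged sketch: inspect the Bunch returned by partial_dependence.
from sklearn.inspection import partial_dependence

pdp = partial_dependence(model, X_train, features=["Culmen Length (mm)"], kind="average")
print(pdp.keys())          # e.g. dict_keys(['grid_values', 'values', 'average'])
                           # ('values' is a deprecated alias of 'grid_values' in newer versions)
print(pdp["grid_values"])  # the grid of feature values (the "Example Values" above)
print(pdp["average"])      # averaged model output at each grid point
```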
@@ -717,12 +711,19 @@ 5.3.1. Partial Dependence for Machine Le
These plots can be very insightful, if you know how to interpret them correctly.
We have the 3 features and how varying these changes the impact in predicting a specific class.
-Interestingly, we can see that the Culmen length for [A] is smaller, because larger values reduce the partial dependence , [B] however seems to have a larger Culmen length and [C] is almost unaffected by this feature!
-Similarly only [C] seems to have larger Flippers, whereas smaller flippers have a lower partial dependence for large values.
-I’m not a penguin expert, I just find them adorable, and I’m able to glean this interpretable information from the plots.
-I think is a great tool!
+Interestingly, we can see that the Culmen length for Adelie is smaller, because larger values reduce the partial dependence; Chinstrap penguins, however, seem to have a larger Culmen length, and Gentoo is almost unaffected by this feature!
+Similarly, only Gentoo seems to have larger Flippers, whereas the other species show a lower partial dependence at large flipper values.
+I’m not a penguin expert, I just find them adorable, and I’m able to glean this interpretable information from the plots. I think it’s a great tool! 🐧
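A minimal sketch of how plots like these are typically drawn; the fitted model `model` and the penguin feature names are assumptions, not the notebook's exact cell:

```python
# Hedged sketch: one-way partial dependence plots for a multi-class model.
from sklearn.inspection import PartialDependenceDisplay

PartialDependenceDisplay.from_estimator(
    model, X_train,
    features=["Culmen Length (mm)", "Culmen Depth (mm)", "Flipper Length (mm)"],
    target=0,  # class index in a multi-class setting (e.g. 0 = Adelie, assumed ordering)
)
```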
+Understanding feature importance is crucial in machine learning, as it helps us identify which features have the most significant impact on model predictions.
+Two standard methods for assessing feature importance are Tree Importance and Permutation Importance.
+Tree Importance, usually associated with tree-based models like random forests, calculates feature importances based on how frequently a feature is used to split nodes in the trees and how much those splits reduce impurity. It’s essentially a counting exercise.
+Features frequently selected for splitting are considered more important because they contribute more to the model’s predictive performance. One benefit of Tree Importance is its computational efficiency, as the importances are obtained as a by-product of training. However, Tree Importance may overestimate the importance of correlated features and of high-cardinality or noisy features, and it struggles to capture feature interactions.
+On the other hand, Permutation Importance assesses feature importance by measuring the decrease in model performance when the values of a feature are randomly shuffled. Features that, when shuffled, lead to a significant decrease in model performance are deemed more important. Permutation Importance is model-agnostic and can be applied to any type of model, making it versatile and applicable in various scenarios. Additionally, Permutation Importance accounts for feature interactions and is less biased by correlated features. However, it is computationally more expensive, especially for models with large numbers of features or complex interactions.
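A minimal sketch of the shuffling procedure just described, assuming the fitted pipeline `rf` and the DataFrame test split from the notebook; `n_repeats` and `random_state` are illustrative choices:

```python
# Hedged sketch: permutation importance on the whole pipeline.
from sklearn.inspection import permutation_importance

result = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=42)
# importances_mean is the average drop in score when that feature is shuffled
for name, mean, std in zip(X_test.columns, result.importances_mean, result.importances_std):
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```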
+People are interested in feature importances for several reasons. Firstly, feature importances provide insights into the underlying relationships between features and the target variable, aiding in feature selection and dimensionality reduction.
+Moreover, understanding feature importances helps researchers and practitioners interpret model predictions and identify potential areas for improvement or further investigation. Feature importances can also inform domain experts and stakeholders about which features are driving model decisions, enhancing transparency and trust in machine learning systems.
+We’ll start out by training a different type of model in this section, a standard Random Forest. Then we can directly compare the tree-based feature importance with permutation importances. The data split from the Data notebook we established earlier remains the same, and the pre-processing is also the same, even though Random Forests handle non-normalised data well.
from sklearn.ensemble import RandomForestClassifier
@@ -737,7 +738,7 @@ 5.3.1.1. Feature importances with Tree i
rf = Pipeline(steps=[
('preprocessor', preprocessor),
- ('classifier', RandomForestClassifier()),
+ ('classifier', RandomForestClassifier(random_state=42)),
])
rf.fit(X_train, y_train)
@@ -751,6 +752,8 @@ 5.3.1.1. Feature importances with Tree i
+Now we can simply plot the feature importances obtained from training the model.
+These will always be slightly different between runs, because Random Forests are trained on randomly selected subsets of the data.
pd.Series(rf.named_steps["classifier"].feature_importances_, index=num_features+['F', 'M']).plot.bar()
@@ -759,7 +762,7 @@ 5.3.1.1. Feature importances with Tree i
-<shap.explainers._tree.TreeExplainer at 0x7f7ad56eb610>
+<shap.explainers._tree.TreeExplainer at 0x7f7772b66eb0>
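A minimal sketch of how such a `TreeExplainer` might be constructed and used; the pipeline step names and the explicit transform call are assumptions based on the pipeline above, not the notebook's exact code:

```python
# Hedged sketch: SHAP values for the Random Forest inside the pipeline.
import shap

explainer = shap.TreeExplainer(rf.named_steps["classifier"])
X_test_enc = rf.named_steps["preprocessor"].transform(X_test)  # same encoding as in training
shap_values = explainer.shap_values(X_test_enc)  # for classifiers: typically one array per class
shap.summary_plot(shap_values, X_test_enc)
```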