|
8 | 8 | "# End to end example on certifying local explanations using `Ecertify`\n",
|
9 | 9 | "\n",
|
10 | 10 | "In this notebook, we demonstrate how to _certify_ a local explanation for a prediction of a classification \n",
|
11 |
| - "model. Here we choose the popular tabular dataset FICO HELOC, and LIME and SHAP explainers for certification.\n", |
| 11 | + "model. Here we choose the popular tabular dataset [FICO HELOC](https://github.com/Trusted-AI/AIX360/blob/master/examples/tutorials/HELOC.ipynb), and use local explainers such as [LIME](https://github.com/marcotcr/lime) and [SHAP](https://github.com/shap/shap) for certification. At the end, we also comment\n", |
| 12 | + "on how the explainers can be compared based on their (found) certification widths ($w$).\n", |
12 | 13 | "The cells below describe steps needed to perform certification in detail:\n",
|
13 |
| - "- obtaining a trained model on the dataset\n", |
14 |
| - "- selecting an instance of interest and computing its explanation\n", |
15 |
| - "- defining the quality criterion (here 1 - mean absolute error, i.e., the fidelity) to assess the degree to which this explanation is applicable to other instances\n", |
16 |
| - "- decide the fidelity threshold $\\theta$\n", |
17 |
| - "- certify the explanation, i.e., find the largest hypercube around the original instance where the computed explanation has sufficiently high fidelity $\\ge \\theta$" |
| 14 | + "\n", |
| 15 | + "1. obtaining a trained model on the dataset; here we use `GradientBoostingClassifier` from the sklearn library\n", |
| 16 | + "2. selecting an instance of interest and computing its explanation\n", |
| 17 | + "3. defining the quality criterion (here we use `1 - mean absolute error`, i.e., the fidelity as mentioned in the paper) to assess the degree to which the computed explanation is _applicable_ to other instances\n", |
| 18 | + "4. deciding the fidelity threshold $\\theta$; this is another user-configurable option\n", |
| 19 | + "5. certifying the explanation, i.e., finding the largest hypercube around the original instance where the computed explanation has sufficiently high fidelity $\\ge \\theta$" |
18 | 20 | ]
|
19 | 21 | },
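The five steps above can be sketched end to end. The snippet below is a minimal, self-contained illustration: the "model" and explanation are hypothetical linear stand-ins for the trained classifier and the LIME/SHAP output, and the brute-force width search only mimics what the actual certification procedure does far more efficiently.

```python
import random

# Steps 1-2 (stand-ins): a hypothetical black-box probability model and a
# local linear explanation computed at the instance of interest x0.
def bb(x):
    # black-box "correct class" probability, clipped to [0, 1]
    return max(0.0, min(1.0, 0.3 + 0.5 * x[0] - 0.2 * x[1]))

x0 = [0.6, 0.4]
weights, bias = [0.5, -0.2], 0.3  # the explanation's linear coefficients

def e(x):
    # apply the explanation to another instance
    return bias + sum(w * xi for w, xi in zip(weights, x))

# Step 3: quality criterion -- fidelity = 1 - absolute error
def fidelity(x):
    return 1.0 - abs(bb(x) - e(x))

# Step 4: user-chosen fidelity threshold
theta = 0.9

# Step 5 (naive stand-in for certification): sample inside hypercubes of
# growing half-width w around x0 and keep the largest w whose sampled
# minimum fidelity stays >= theta.
random.seed(0)

def min_fidelity(w, n=2000):
    return min(
        fidelity([xi + random.uniform(-w, w) for xi in x0]) for _ in range(n)
    )

certified_w = 0.0
for k in range(1, 21):
    w = k / 20
    if min_fidelity(w) >= theta:
        certified_w = w
print(certified_w)
```

Inside the region where the black box is not clipped, the explanation matches it exactly, so the certified width here is limited only by where clipping starts to break fidelity.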
|
20 | 22 | {
|
|
42 | 44 | "import numpy as np; np.set_printoptions(suppress=True)\n",
|
43 | 45 | "import pandas as pd\n",
|
44 | 46 | "\n",
|
| 47 | + "\n", |
| 48 | + "# sklearn utilities\n", |
45 | 49 | "from sklearn.model_selection import train_test_split\n",
|
46 | 50 | "from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor\n",
|
47 | 51 | "from sklearn.metrics import f1_score, accuracy_score, precision_score, recall_score, r2_score\n",
|
48 | 52 | "\n",
|
49 | 53 | "\n",
|
50 |
| - "\n", |
51 | 54 | "# explainers\n",
|
52 | 55 | "import shap, lime\n",
|
53 | 56 | "from lime.lime_tabular import LimeTabularExplainer\n",
|
54 | 57 | "\n",
|
55 | 58 | "# our code\n",
|
56 |
| - "from aix360.algorithms.ecertify.utils import load_fico_dataset, compute_lime_explainer, compute_shap_explainer" |
| 59 | + "from aix360.algorithms.ecertify.utils import load_fico_dataset, compute_lime_explainer, compute_shap_explainer" |
57 | 61 | ]
|
58 | 62 | },
|
59 | 63 | {
|
|
107 | 111 | "id": "f8a8cb86",
|
108 | 112 | "metadata": {},
|
109 | 113 | "source": [
|
110 |
| - "## Prepare the callables for querying the model and the explanation during certification" |
| 114 | + "## Prepare the callables for querying the model and the explanation during certification\n", |
| 115 | + "The next cell prepares a few callables, `_bb()` and `_e()`, to query the `model` and the explanation. Note that\n", |
| 116 | + "we are assuming we have obtained a functional form of the computed explanation which can be _applied_ to another\n", |
| 117 | + "instance. In the case of LIME, the explanation is a linear function whose weights/coefficients are the feature\n", |
| 118 | + "importance values in the explanation. For KernelSHAP, a similar linear function is used as well. We apply the\n", |
| 119 | + "function to an instance and obtain the probability of the correct class (the original instance's class for which\n", |
| 120 | + "the explanation was computed).\n", |
| 121 | + "\n", |
| 122 | + "Later, when we have the `model` and the `expl_func` ready, we wrap these functions (`_bb()` and `_e()`) with\n", |
| 123 | + "`functools.partial` to hide the second argument (e.g., `expl_func` in `_e()`) and only pass the first argument `x`." |
111 | 124 | ]
|
112 | 125 | },
|
113 | 126 | {
|
114 | 127 | "cell_type": "code",
|
115 |
| - "execution_count": 4, |
| 128 | + "execution_count": 7, |
116 | 129 | "id": "dba4d6dc",
|
117 | 130 | "metadata": {},
|
118 | 131 | "outputs": [],
|
|
122 | 135 | "def _bb(x, model, label_x0=0, is_regression=False):\n",
|
123 | 136 | " \"\"\"\n",
|
124 | 137 | " x: single 1d numpy array of shape (d, )\n",
|
| 138 | + " label_x0: if classification, we need to take the correct class' probability\n", |
125 | 139 | " \"\"\"\n",
|
126 | 140 | " x = [x]\n",
|
127 | 141 | "\n",
|
|
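Under these assumptions, the wrapping described above looks roughly as follows. `DummyModel` and the linear `expl_func` are illustrative stand-ins (not the notebook's trained classifier or a real LIME/SHAP output); the point is only how `functools.partial` hides the extra arguments so that certification only ever passes `x`.

```python
from functools import partial
import numpy as np

class DummyModel:
    """Stand-in classifier exposing an sklearn-style predict_proba."""
    def predict_proba(self, X):
        p = 1.0 / (1.0 + np.exp(-np.asarray(X).sum(axis=1)))  # logistic score
        return np.column_stack([1.0 - p, p])

def _bb(x, model, label_x0=0):
    """Query the black box: probability of the class predicted for x0."""
    return model.predict_proba([x])[0, label_x0]

def _e(x, expl_func):
    """Apply the (linear) explanation function to another instance."""
    return expl_func(x)

model = DummyModel()
weights, intercept = np.array([0.2, -0.1, 0.05]), 0.4
expl_func = lambda x: intercept + float(np.dot(weights, x))

# hide the second arguments so both callables take only x
bb = partial(_bb, model=model, label_x0=1)
e = partial(_e, expl_func=expl_func)
f = lambda x: 1 - abs(bb(x) - e(x))  # fidelity of this explanation at x

print(bb(np.zeros(3)), e(np.zeros(3)), f(np.zeros(3)))
```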
146 | 160 | "id": "c2fe7c55",
|
147 | 161 | "metadata": {},
|
148 | 162 | "source": [
|
149 |
| - "## Choose a random sample for finding explanation and certification" |
| 163 | + "## Choose a random sample for finding explanation and certification\n", |
| 164 | + "Here we choose a prototype instance that is also discussed in the paper." |
150 | 165 | ]
|
151 | 166 | },
|
152 | 167 | {
|
153 | 168 | "cell_type": "code",
|
154 |
| - "execution_count": 5, |
| 169 | + "execution_count": 8, |
155 | 170 | "id": "0c1cbae2",
|
156 | 171 | "metadata": {},
|
157 | 172 | "outputs": [
|
|
183 | 198 | },
|
184 | 199 | {
|
185 | 200 | "cell_type": "code",
|
186 |
| - "execution_count": 6, |
| 201 | + "execution_count": 9, |
187 | 202 | "id": "0f832a6c",
|
188 | 203 | "metadata": {},
|
189 | 204 | "outputs": [
|
|
335 | 350 | "PercentTradesWBalance 81.0"
|
336 | 351 | ]
|
337 | 352 | },
|
338 |
| - "execution_count": 6, |
| 353 | + "execution_count": 9, |
339 | 354 | "metadata": {},
|
340 | 355 | "output_type": "execute_result"
|
341 | 356 | }
|
|
347 | 362 | },
|
348 | 363 | {
|
349 | 364 | "cell_type": "code",
|
350 |
| - "execution_count": 7, |
| 365 | + "execution_count": 11, |
351 | 366 | "id": "e5ab5b5c",
|
352 | 367 | "metadata": {},
|
353 | 368 | "outputs": [],
|
354 | 369 | "source": [
|
355 |
| - "# prepare the blackbox function for querying the model\n", |
| 370 | + "# prepare the blackbox function for querying the model using functools.partial\n", |
356 | 371 | "bb = partial(_bb, model=model, label_x0=label_x0, is_regression=False)"
|
357 | 372 | ]
|
358 | 373 | },
|
|
364 | 379 | "## Check fidelity of explanations on the point `x0` itself\n",
|
365 | 380 | "\n",
|
366 | 381 | "Note that LIME explanations can have fidelity less than 1.0 on that instance, i.e., the linear/affine function \n",
|
367 |
| - "approximated the model at $x_0$ need not pass through the model's prediction for $x_0$. But for KernelSHAP, \n", |
368 |
| - "the approximating linear function always passes through model's prediction." |
| 382 | + "approximating the model at $x_0$ need not pass through the model's predicted probability for $x_0$. But for \n", |
| 383 | + "KernelSHAP, the approximating linear function always passes through the model's predicted probability (this is\n", |
| 384 | + "also known as the efficiency property of Shapley values, i.e., the sum of the attribution values equals the \n", |
| 385 | + "total value)." |
369 | 386 | ]
|
370 | 387 | },
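The efficiency property can be checked with toy numbers (the attribution and prediction values below are made up for illustration): for exact Shapley values the base value plus the attributions reproduces the model's prediction at $x_0$, so the induced linear explanation has fidelity exactly 1 there, whereas a LIME-style least-squares fit carries no such constraint.

```python
base = 0.45                # hypothetical base value (average prediction)
phi = [0.12, -0.03, 0.08]  # hypothetical Shapley attributions at x0
pred_x0 = 0.62             # model's predicted probability at x0

# Efficiency: base + sum(phi) == pred_x0, so fidelity at x0 is exactly 1.
shap_fidelity = 1 - abs(pred_x0 - (base + sum(phi)))

# A least-squares local fit need not pass through the prediction, so its
# fidelity can be below 1 even at x0 itself.
lime_at_x0 = 0.58          # hypothetical fitted value at x0
lime_fidelity = 1 - abs(pred_x0 - lime_at_x0)
print(shap_fidelity, lime_fidelity)
```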
|
371 | 388 | {
|
372 | 389 | "cell_type": "code",
|
373 |
| - "execution_count": 8, |
| 390 | + "execution_count": 12, |
374 | 391 | "id": "1af94b94",
|
375 | 392 | "metadata": {},
|
376 | 393 | "outputs": [
|
|
427 | 444 | },
|
428 | 445 | {
|
429 | 446 | "cell_type": "code",
|
430 |
| - "execution_count": 9, |
| 447 | + "execution_count": 16, |
431 | 448 | "id": "f63775d5",
|
432 | 449 | "metadata": {},
|
433 | 450 | "outputs": [],
|
434 | 451 | "source": [
|
435 | 452 | "from aix360.algorithms.ecertify.ecertify import CertifyExplanation"
|
436 | 453 | ]
|
437 | 454 | },
|
| 455 | + { |
| 456 | + "cell_type": "markdown", |
| 457 | + "id": "7ba1e370", |
| 458 | + "metadata": {}, |
| 459 | + "source": [ |
| 460 | + "First we certify the LIME explanation. We recompute it here since the variables were overwritten by the SHAP\n", |
| 461 | + "explanation in the previous cell." |
| 462 | + ] |
| 463 | + }, |
438 | 464 | {
|
439 | 465 | "cell_type": "code",
|
440 |
| - "execution_count": 10, |
| 466 | + "execution_count": 17, |
441 | 467 | "id": "1940a4f3",
|
442 | 468 | "metadata": {},
|
443 | 469 | "outputs": [],
|
444 | 470 | "source": [
|
445 |
| - "# do it for LIME explanation\n", |
446 | 471 | "func, expl = compute_lime_explainer(x_train, model, x0, num_features=len(x0))\n",
|
447 | 472 | "e = partial(_e, expl_func=func)\n",
|
448 | 473 | "f = lambda x: 1 - abs(bb(x) - e(x)) # fidelity function (specific to this explanation)"
|
|
458 | 483 | },
|
459 | 484 | {
|
460 | 485 | "cell_type": "code",
|
461 |
| - "execution_count": 11, |
| 486 | + "execution_count": 22, |
462 | 487 | "id": "47bc1168",
|
463 | 488 | "metadata": {},
|
464 | 489 | "outputs": [],
|
465 | 490 | "source": [
|
466 | 491 | "# Inputs to Certify()\n",
|
467 |
| - "theta = 0.75 # user desired fidelity threshold \n", |
| 492 | + "theta = 0.75 # user desired fidelity threshold \n", |
468 | 493 | "lb = 0; ub = 1 # init hypercube of size 1\n",
|
469 |
| - "Q = 10000 # query budget related arguments\n", |
470 |
| - "Z = 10 # number of halving/doubling iterations during certification (so total queries expensed = Z*Q)\n", |
471 |
| - "sigma0 = 0.1 # sigma for gaussians used in unifI and adaptI strategies\n", |
472 |
| - "NUMRUNS = 10 # consider running for more iterations here for reduced error\n", |
| 494 | + "Q = 10000 # query budget related arguments\n", |
| 495 | + "Z = 10 # number of halving/doubling iterations during certification (so total queries expensed = Z*Q)\n", |
| 496 | + "sigma0 = 0.1 # sigma for gaussians used in unifI and adaptI strategies\n", |
| 497 | + "NUMRUNS = 10 # consider running for more iterations here for reduced error\n", |
473 | 498 | "\n",
|
474 | 499 | "certifier = CertifyExplanation(theta=theta, Q=Q, Z=Z, lb=lb, ub=ub, sigma0=sigma0, numruns=NUMRUNS)"
|
475 | 500 | ]
|
476 | 501 | },
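To see what `Q`, `Z`, and `theta` control, here is a deliberately simplified stand-in for the certification loop: a randomized expand/shrink search under uniform sampling. It is a sketch of the idea only, not the actual `CertifyExplanation` algorithm or its `unifI`/`adaptI` strategies; `certify_sketch` and the toy fidelity function are hypothetical.

```python
import random

def certify_sketch(f, x0, theta, lb=0.0, ub=1.0, Q=1000, Z=10, seed=0):
    """Search for a large half-width w in [lb, ub] such that fidelity f
    stays >= theta on Q uniform samples from the hypercube around x0,
    expanding or shrinking w over Z iterations (Z * Q queries in total)."""
    rng = random.Random(seed)
    w, best = ub, lb
    for _ in range(Z):
        certified = all(
            f([xi + rng.uniform(-w, w) for xi in x0]) >= theta
            for _ in range(Q)
        )
        if certified:
            best = max(best, w)
            w = (w + ub) / 2    # certified at w: try a larger cube
        else:
            w = (best + w) / 2  # a violation was sampled: shrink back
    return best

# toy fidelity: linear explanation of a clipped-linear "model"
bb = lambda x: max(0.0, min(1.0, 0.5 + 0.6 * x[0]))
e = lambda x: 0.5 + 0.6 * x[0]
f = lambda x: 1 - abs(bb(x) - e(x))

w_found = certify_sketch(f, x0=[0.2], theta=0.9, Q=500, Z=8)
print(w_found)
```

Here the explanation matches the model exactly until the model's output clips at 1, so the search settles on a width close to where clipping pushes the error past `1 - theta`.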
|
477 | 502 | {
|
478 | 503 | "cell_type": "code",
|
479 |
| - "execution_count": 12, |
| 504 | + "execution_count": 23, |
480 | 505 | "id": "4d988077",
|
481 | 506 | "metadata": {},
|
482 | 507 | "outputs": [
|
483 | 508 | {
|
484 | 509 | "name": "stdout",
|
485 | 510 | "output_type": "stream",
|
486 | 511 | "text": [
|
487 |
| - "Time per run: 5.247 s\n", |
488 |
| - "Found w: 0.3178 ± 0.142338\n" |
| 512 | + "Time per run: 6.844 s\n", |
| 513 | + "Found w: 0.4298 ± 0.047384\n" |
489 | 514 | ]
|
490 | 515 | }
|
491 | 516 | ],
|
|
502 | 527 | "metadata": {},
|
503 | 528 | "source": [
|
504 | 529 | "## Certification of SHAP\n",
|
505 |
| - "Similarly we could also certify the KernelSHAP explanation for the same instance, just using a different \n", |
506 |
| - "quality criterion." |
| 530 | + "Similarly, we can certify the KernelSHAP explanation for the same instance, using a quality criterion defined with the SHAP `expl_func` object." |
507 | 531 | ]
|
508 | 532 | },
|
509 | 533 | {
|
510 | 534 | "cell_type": "code",
|
511 |
| - "execution_count": 13, |
| 535 | + "execution_count": 24, |
512 | 536 | "id": "fd418d56",
|
513 | 537 | "metadata": {},
|
514 | 538 | "outputs": [],
|
|
521 | 545 | },
|
522 | 546 | {
|
523 | 547 | "cell_type": "code",
|
524 |
| - "execution_count": 14, |
| 548 | + "execution_count": 25, |
525 | 549 | "id": "7a7dfbf4",
|
526 | 550 | "metadata": {},
|
527 | 551 | "outputs": [
|
528 | 552 | {
|
529 | 553 | "name": "stdout",
|
530 | 554 | "output_type": "stream",
|
531 | 555 | "text": [
|
532 |
| - "Time per run: 5.871 s\n", |
533 |
| - "Found w: 0.0416 ± 0.014154\n" |
| 556 | + "Time per run: 4.467 s\n", |
| 557 | + "Found w: 0.0272 ± 0.002681\n" |
534 | 558 | ]
|
535 | 559 | }
|
536 | 560 | ],
|
|
547 | 571 | "metadata": {},
|
548 | 572 | "source": [
|
549 | 573 | "## Observation\n",
|
550 |
| - "Note that for this instance, the LIME explanation has a larger certifier width ($\\approx 0.3$) than the \n", |
551 |
| - "KernelSHAP explanation ($\\approx 0.04$), implying the linear explanation obtained from LIME is applicable to a \n", |
552 |
| - "relatively large neighbourhood than the corresponding KernelSHAP explanation." |
| 574 | + "Note that for this instance, the LIME explanation has a larger certification width ($\\approx 0.3-0.4$) than the \n", |
| 575 | + "KernelSHAP explanation ($\\approx 0.02-0.04$), implying that the linear explanation obtained from LIME is\n", |
| 576 | + "applicable to a relatively larger neighbourhood than the corresponding KernelSHAP explanation." |
553 | 577 | ]
|
554 | 578 | },
|
555 | 579 | {
|
|