InputIBA: Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information

This repository is the official implementation of our paper accepted at NeurIPS 2021.

We propose an attribution method called InputIBA that produces input-level explanations by combining an information bottleneck on a latent layer with a GAN that fits the feature distribution. For details of the method, please refer to our paper. Further information can be found on the project's homepage.

The method yields a fine-grained attribution map that is optimized directly on the input, so the attribution has the same resolution as the input and can show more detail. In the example below, the generated attribution map directly reflects the regions that drive the model's decision, while other similar features (like the coins in the image) are ruled out.

Image Mask

Moreover, our method relaxes some assumptions of the previous method, which makes it model-agnostic. We demonstrate this model-agnostic ability on both vision and NLP tasks, e.g. on convolutional and recurrent neural networks.

Example Results

Here is an example of attribution maps produced by various attribution methods. By inspection, we can see that the attribution map of our method is much more fine-grained than other explanation methods.

Here is another example, this time identifying informative tokens (words and symbols). Our method highlights the important features, and the result is more interpretable to humans compared to other methods.

Requirements

Install torch and torchvision (and torchtext for NLP tasks) following the official PyTorch instructions.
Install mmcv or mmcv-full following the official instructions of mmcv.
Since our code only uses a limited set of features from MMCV, the lite version is sufficient and can be installed with pip install mmcv.
Install additional requirements with pip install -r requirements.txt.
Install the package in develop mode: python setup.py develop.
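
After the steps above, a quick import check can confirm that everything is visible to the interpreter. The sketch below is not part of the repository: the helper name `find_missing` is hypothetical, and the import names (`torch`, `torchvision`, `torchtext`, `mmcv`, `input_iba`) are assumed from the packages and directories named in this README.

```python
from importlib.util import find_spec

# Import names assumed from the steps above; torchtext is only needed for NLP tasks.
REQUIRED = ["torch", "torchvision", "mmcv", "input_iba"]
OPTIONAL = ["torchtext"]


def find_missing(names):
    """Return the subset of top-level `names` the import system cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = find_missing(names)
        status = "all found" if not missing else "missing: " + ", ".join(missing)
        print(f"{label}: {status}")
```

The check only locates the packages without importing them, so it stays fast and does not initialize CUDA.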

Run Attribution

Jupyter Notebook as hands-on tutorial

We provide two Jupyter notebooks under tutorials/, one for the NLP task and one for the computer vision task. They offer an interactive way to see how to run attribution with InputIBA on a single sample.

The two notebooks are here for the vision task and here for the NLP task.

Batch-wise attribution generation

The scripts below generate attributions in batches.

Computer Vision Task: Image Classification

Download the ImageNet validation set. Format it in the torchvision.datasets.ImageFolder style if necessary. Use this script to generate two small sets: an estimation set and an attribution set. The estimation set is used to estimate the mean and standard deviation of the hidden features, while the attribution set consists of the images whose predictions are to be explained. Copy this JSON file to the dataset root. The dataset should have the following structure:
```
.
|-- annotations
|   `-- attribution
|       |-- n01440764
|       |-- n01443537
|       |-- n01484850
|       ...
|-- imagenet_class_index.json
`-- images
    |-- attribution
    |   |-- n01440764
    |   |-- n01443537
    |   |-- n01484850
    |   ...
    `-- estimation
        |-- n01440764
        |-- n01443537
        |-- n01484850
        ...
```
Note that the annotations/ directory is only necessary for evaluating localization ability of attribution methods (the EHR metric proposed in the paper). One can modify line 35 in the config file to with_bbox=False, if no bounding box annotations are available.
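
To illustrate what the two image sets look like on disk, here is a minimal sketch of splitting an ImageFolder-style directory into disjoint estimation and attribution subsets. It is not the script linked above: the function name `split_imagefolder`, its parameters, and the random sampling strategy are assumptions made for illustration. It does not handle the annotations/ directory or the class index JSON file.

```python
import random
import shutil
from pathlib import Path


def split_imagefolder(src_root, dst_root, n_estimation, n_attribution, seed=0):
    """Copy two disjoint random subsets of an ImageFolder-style directory.

    `src_root` is expected to contain one sub-directory per class (e.g. n01440764).
    Files are copied to `dst_root/images/estimation/<class>/` and
    `dst_root/images/attribution/<class>/`.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    files = sorted(p for p in src_root.glob("*/*") if p.is_file())
    if n_estimation + n_attribution > len(files):
        raise ValueError("not enough images for the requested split sizes")

    random.Random(seed).shuffle(files)  # seeded so the split is reproducible
    subsets = {
        "estimation": files[:n_estimation],
        "attribution": files[n_estimation:n_estimation + n_attribution],
    }
    for name, subset in subsets.items():
        for src in subset:
            # Keep the class directory name so ImageFolder can still infer labels.
            dst = dst_root / "images" / name / src.parent.name / src.name
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
    return {name: len(subset) for name, subset in subsets.items()}
```

A call such as `split_imagefolder("path/to/val", "path/to/imagenet_data", 1000, 100)` would produce the images/ part of the tree shown above; the set sizes here are placeholders, not values taken from the paper.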

We also provide a preprocessed small ImageNet dataset, which can be downloaded from this link.
Create a directory under this repository with mkdir data, and link the ImageNet data path to data/imagenet: ln -s path/to/imagenet_data/ data/imagenet.
Create a directory to store the output files: mkdir workdirs.

Run the training script with the specified configuration file (e.g. vgg16_imagenet) to train the attributor:

python tools/vision/train.py \
    configs/vgg_imagenet.py \
    --work-dir workdirs/vgg_imagenet/ \
    --gpu-id 0 \
    --pbar

Check the results saved in workdirs/vgg_imagenet/: input_masks/ contains the final attribution maps, while feat_masks/ contains the attribution maps produced by the IB at the feature map level (the original IBA).
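
For readers unfamiliar with the feature-level bottleneck behind feat_masks/, the following is a minimal scalar sketch of the noise-injection idea as it is commonly described for the original IBA: a feature value is blended with Gaussian noise whose mean and standard deviation come from an estimation set, and the mask value controls how much of the feature passes through. This is not the repository's code. All names (`bottleneck`, `information_cost`, `lam`) and the toy numbers are hypothetical, and the real method operates on tensors and optimizes the mask by gradient descent.

```python
import math
import random
import statistics


def bottleneck(feature, lam, mean, std, rng):
    """Blend one feature value with Gaussian noise: z = lam * r + (1 - lam) * eps."""
    eps = rng.gauss(mean, std)  # noise drawn from the estimated feature distribution
    return lam * feature + (1.0 - lam) * eps


def information_cost(feature, lam, mean, std):
    """KL divergence (in nats) between z given the feature and N(mean, std^2).

    Defined for 0 <= lam < 1; the cost grows without bound as lam approaches 1.
    """
    r = (feature - mean) / std          # normalized feature
    mu_z, sigma_z = lam * r, 1.0 - lam  # mean and std of z in normalized coordinates
    return -math.log(sigma_z) + 0.5 * (sigma_z ** 2 + mu_z ** 2) - 0.5


# Toy "estimation set": values of a single hidden unit over several images.
estimation_values = [0.2, 1.4, 0.9, 0.1, 2.3, 0.7]
mean = statistics.fmean(estimation_values)
std = statistics.pstdev(estimation_values)

rng = random.Random(0)
for lam in (0.0, 0.5, 0.9):
    z = bottleneck(2.3, lam, mean, std, rng)
    cost = information_cost(2.3, lam, mean, std)
    print(f"lam={lam}: z={z:.3f}, cost={cost:.3f} nats")
```

With `lam = 0` the feature is replaced entirely by noise and the cost is zero; as `lam` grows, more of the feature passes through and the cost rises. The attribution map can be read as the per-element mask that keeps the prediction intact at the lowest total cost.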

NLP Task: Sentence Classification

We provide a pretrained multi-layer LSTM trained on the IMDb dataset. Download the checkpoint file from this link.
Create a directory with mkdir pretrained and move the downloaded checkpoint file into pretrained.

Run the training script with the specified configuration (deep_lstm) to train the attributor:

python tools/nlp/train_nlp.py \
    configs/deep_lstm.py \
    --work-dir workdirs/lstm_imdb/ \
    --gpu-id 0 \
    --pbar

Check the results saved in workdirs/lstm_imdb/: input_masks/ contains the final attribution maps (at word level), while feat_masks/ contains the attribution maps produced by the IB at feature map level.
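
As an illustration of how a word-level attribution map can be read, the sketch below pairs a list of tokens with one score per token and lists the most informative ones. The on-disk format of the files in input_masks/ is not described here, so the example uses in-memory lists with made-up values; the helper names `top_tokens` and `render` are hypothetical.

```python
def top_tokens(tokens, mask, k=3):
    """Return the k tokens with the highest attribution, as (token, score) pairs."""
    if len(tokens) != len(mask):
        raise ValueError("tokens and mask must have the same length")
    ranked = sorted(zip(tokens, mask), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def render(tokens, mask, threshold=0.5):
    """Mark tokens whose attribution reaches the threshold with [brackets]."""
    return " ".join(f"[{t}]" if m >= threshold else t for t, m in zip(tokens, mask))


# Made-up example values; a real mask would come from the attributor's output.
tokens = ["this", "movie", "was", "surprisingly", "good", "!"]
mask = [0.05, 0.30, 0.02, 0.65, 0.90, 0.10]
print(top_tokens(tokens, mask, k=2))
print(render(tokens, mask))
```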

Pre-trained Models

Like many attribution methods, our method is applied on a per-sample basis. For each new image, the Attributor trains new components (FeatureIBA, WGAN, InputIBA). Attribution methods are used to explain models that have already been trained, so there is no need to provide any pre-trained models here.

Run Evaluation

We implemented a handful of evaluation metrics including Sanity Check, Insertion/Deletion, Sensitivity-N, and our own proposed metric called EHR (Effective Heat Ratios). Details of how to run evaluations on attribution methods can be found in input_iba/evaluation or in this tutorial.
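
To give a feel for what the Deletion metric measures, here is a generic, framework-free sketch: features are removed in order of decreasing attribution, the model score is recorded after each removal, and the area under the resulting curve is reported. It is not the implementation in input_iba/evaluation. The function names, the zero baseline, and the toy linear model are assumptions chosen to keep the example small.

```python
def deletion_curve(values, attribution, model, baseline=0.0):
    """Model score after removing features one by one, most important first."""
    order = sorted(range(len(values)), key=lambda i: attribution[i], reverse=True)
    current = list(values)
    scores = [model(current)]
    for i in order:
        current[i] = baseline  # "delete" the feature by replacing it with a baseline
        scores.append(model(current))
    return scores


def area_under_curve(scores):
    """Trapezoidal area over a unit-length x-axis; lower is better for deletion."""
    steps = len(scores) - 1
    return sum((a + b) / 2 for a, b in zip(scores, scores[1:])) / steps


# Toy stand-in for a classifier: the score depends mostly on feature 2.
def toy_model(x):
    return 0.1 * x[0] + 0.1 * x[1] + 0.8 * x[2]


x = [1.0, 1.0, 1.0]
good = area_under_curve(deletion_curve(x, [0.1, 0.1, 0.8], toy_model))
bad = area_under_curve(deletion_curve(x, [0.8, 0.1, 0.1], toy_model))
print(f"faithful attribution AUC: {good:.3f}, misleading attribution AUC: {bad:.3f}")
```

An attribution that ranks the truly important feature first makes the score drop quickly and therefore gets the smaller area. The Insertion metric is the mirror image: it starts from the baseline and adds features back, so a larger area is better.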

License

This repository is released under the MIT license.
