XAI

Papers and code for Explainable AI, especially with respect to image classification.

2013 Conference Papers

Title | Paper Title | Source Link | Code | Tags
Visualization of CNN Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps CVPR2013 PyTorch Visualization gradient-based saliency maps
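
The Simonyan et al. entry above is the canonical gradient-based saliency map: the attribution for each pixel is the gradient of the class score with respect to that pixel. A minimal sketch, assuming a pretrained torchvision classifier; the random tensor is a stand-in for a real preprocessed image:

```python
import torch
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()

x = torch.randn(1, 3, 224, 224, requires_grad=True)  # stand-in for a preprocessed image
scores = model(x)
scores[0, scores.argmax()].backward()  # gradient of the top class score w.r.t. pixels

# Saliency map: max absolute gradient over the colour channels (H x W).
saliency = x.grad.abs().max(dim=1)[0].squeeze(0)
```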

2016 Conference Papers

Title | Paper Title | Source Link | Code | Tags
CAM Learning Deep Features for Discriminative Localization CVPR2016 PyTorch (Official) class activation mapping (see the sketch after this table)
LIME “Why Should I Trust You?” Explaining the Predictions of Any Classifier KDD2016 PyTorch (Official) trust a prediction
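
CAM exploits the global-average-pool + single-FC structure of the classifier: the class-specific map is the FC weight vector applied across the last conv layer's spatial feature maps. A minimal sketch, assuming a ResNet-style torchvision model (not the authors' official code):

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()
x = torch.randn(1, 3, 224, 224)          # stand-in for a preprocessed image

feats = {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(out=o))  # last conv maps
logits = model(x)
cls = logits.argmax(dim=1).item()

fmap = feats["out"].squeeze(0)           # (C, h, w) feature maps
w = model.fc.weight[cls]                 # (C,) class-specific FC weights
cam = torch.einsum("c,chw->hw", w, fmap)
cam = F.interpolate(cam[None, None], size=x.shape[-2:], mode="bilinear")[0, 0]
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize for display
```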

2017 Conference Papers

Title | Paper Title | Source Link | Code | Tags
Grad-CAM Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization ICCV2017, CVPR2016 (original) PyTorch Visualization gradient-based saliency maps (see the sketch after this table)
Network Dissection Network Dissection: Quantifying Interpretability of Deep Visual Representations CVPR2017 PyTorch (Official) Visualization
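
Grad-CAM generalizes CAM to arbitrary architectures by weighting each feature map with the global-average-pooled gradient of the class score. A minimal sketch, again on a torchvision ResNet with `layer4` chosen as the target layer (a choice of this sketch, not part of the method):

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()
x = torch.randn(1, 3, 224, 224)          # stand-in for a preprocessed image

acts, grads = {}, {}
model.layer4.register_forward_hook(lambda m, i, o: acts.update(v=o))
model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(v=go[0]))

logits = model(x)
logits[0, logits.argmax()].backward()

alpha = grads["v"].mean(dim=(2, 3), keepdim=True)            # per-channel importance
cam = F.relu((alpha * acts["v"]).sum(dim=1, keepdim=True))   # ReLU keeps positive evidence
cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear")[0, 0]
```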

2018 Conference Papers

Title | Paper Title | Source Link | Code | Tags
TCAV Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) ICML 2018 Tensorflow 1.15.2 interpretability method (see the sketch after this table)
Interpretable CNN Interpretable Convolutional Neural Networks CVPR 2018 Tensorflow 1.x explainability by design
Anchors Anchors: High-Precision Model-Agnostic Explanations AAAI 2018 sklearn (Official) model-agnostic
Sanity Checks Sanity checks for saliency maps NeurIPS 2018 PyTorch saliency methods vs edge detector
Grad-CAM++ Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks WACV 2018 PyTorch saliency maps
Interpretable Basis Interpretable Basis Decomposition for Visual Explanation ECCV 2018 PyTorch ibd
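
TCAV tests whether a user-defined concept influences a class: a linear probe separating concept activations from random activations yields a concept activation vector (CAV), and the TCAV score is the fraction of inputs whose class-score gradient points along the CAV. A toy sketch with random placeholder activations and gradients; a real run would extract both from a chosen layer of the network:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
concept_acts = rng.normal(0.5, 1.0, size=(50, 512))  # placeholder: layer activations, concept images
random_acts = rng.normal(0.0, 1.0, size=(50, 512))   # placeholder: layer activations, random images

X = np.vstack([concept_acts, random_acts])
y = np.array([1] * 50 + [0] * 50)
cav = LogisticRegression(max_iter=1000).fit(X, y).coef_[0]  # the concept direction

grads = rng.normal(size=(100, 512))                  # placeholder: d(class score)/d(activations)
tcav_score = float((grads @ cav > 0).mean())         # fraction aligned with the concept
print(f"TCAV score: {tcav_score:.2f}")
```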

2019 Conference Papers

Title | Paper Title | Source Link | Code | Tags
Full-grad Full-Gradient Representation for Neural Network Visualization NeurIPS2019 PyTorch (Official) Tensorflow saliency map representation
This looks like that This Looks Like That: Deep Learning for Interpretable Image Recognition NeurIPS2019 PyTorch (Official) object (see the sketch after this table)
Counterfactual visual explanations Counterfactual visual explanations ICML2019 interpretability
concept with contribution, interpretable CNN Explaining Neural Networks Semantically and Quantitatively ICCV 2019
SIS What made you do this? Understanding black-box decisions with sufficient input subsets AISTATS 2019 - Supplementary Material Tensorflow 1.x
Filter as concept detector Filters in Convolutional Neural Networks as Independent Detectors of Visual Concepts ACM
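
The "This Looks Like That" entry classifies by similarity to learned prototypes, so every logit decomposes into per-prototype evidence that can be shown to the user. A schematic head in that spirit; the shapes and the distance-to-similarity mapping are illustrative, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class ProtoHead(nn.Module):
    def __init__(self, dim=128, n_protos=20, n_classes=5):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(n_protos, dim))
        self.classify = nn.Linear(n_protos, n_classes, bias=False)

    def forward(self, z):                       # z: (B, dim) feature embeddings
        d = torch.cdist(z, self.prototypes)     # distance to each prototype
        sim = torch.log((d + 1) / (d + 1e-4))   # small distance -> large similarity
        return self.classify(sim), sim          # logits + per-prototype evidence

logits, evidence = ProtoHead()(torch.randn(4, 128))
```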

2020 Papers

Title | Paper Title | Source Link | Code | Tags
INN Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs ECCV 2020 PyTorch explainability by design
Revisiting BP saliency There and Back Again: Revisiting Backpropagation Saliency Methods CVPR 2020 PyTorch grad cam failure noted
Interacting with explanation Making deep neural networks right for the right scientific reasons by interacting with their explanations Nature Machine Intelligence sklearn
Class specific Filters Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters ECCV Supplementary Material Code not yet updated; ICLR-rejected version with reviews
Interpretable Decoupling Interpretable Neural Network Decoupling ECCV 2020
iCaps iCaps: An Interpretable Classifier via Disentangled Capsule Networks ECCV Supplementary Material
VQA Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision ECCV 2020 PyTorch
When explanations lie When Explanations Lie: Why Many Modified BP Attributions Fail ICML 2020 PyTorch
Similarity models Towards Visually Explaining Similarity Models Arxiv
Quantify trust How Much Should I Trust You? Modeling Uncertainty of Black Box Explanations NeurIPS 2020 submission hima_lakkaraju,sameer_singh,model-agnostic
Concepts for segmentation task ABSTRACTING DEEP NEURAL NETWORKS INTO CONCEPT GRAPHS FOR CONCEPT LEVEL INTERPRETABILITY Arxiv Tensorflow 1.14 brain tumour segmentation
Deep Lift based Network Pruning Utilizing Explainable AI for Quantization and Pruning of Deep Neural Networks Arxiv NeurIPS format nas,deep_lift
Unified Attribution Framework A Unified Taylor Framework for Revisiting Attribution Methods Arxiv (updated) taylor, attribution_framework
Global Concept Attribution Towards Global Explanations of Convolutional Neural Networks with Concept Attribution CVPR 2020
relevance estimation Determining the Relevance of Features for Deep Neural Networks ECCV 2020
localized concept maps Explaining AI-based Decision Support Systems using Concept Localization Maps Arxiv Just repository created
quantify saliency Quantifying Explainability of Saliency Methods in Deep Neural Networks Arxiv PyTorch
generalization of LIME - MeLIME MeLIME: Meaningful Local Explanation for Machine Learning Models Arxiv Tensorflow 1.15
global counterfactual explanations Interpretable and Interactive Summaries of Actionable Recourses Arxiv
fine grained counterfactual heatmaps SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition Arxiv PyTorch scouter
quantify trust How Much Can We Really Trust You? Towards Simple, Interpretable Trust Quantification Metrics for Deep Neural Networks Arxiv
Non-negative concept activation vectors IMPROVING INTERPRETABILITY OF CNN MODELS USING NON-NEGATIVE CONCEPT ACTIVATION VECTORS Arxiv
different layer activations Explaining Neural Networks by Decoding Layer Activations Arxiv
concept bottleneck networks Concept Bottleneck Models ICML 2020 PyTorch (see the sketch after this table)
attribution Visualizing the Impact of Feature Attribution Baselines Distill
CSI Contextual Semantic Interpretability Arxiv explainable_by_design
Improve black box via explanation Introspective Learning by Distilling Knowledge from Online Self-explanation Arxiv knowledge_distillation
Patch explanations Information-Theoretic Visual Explanation for Black-Box Classifiers Arxiv Tensorflow 1.13.1 patch_sampling,information_theory
Causality Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect NeurIPS 2020 PyTorch
Concept in Time series data Conceptual Explanations of Neural Network Prediction for Time Series IJCNN 2020 time series, see if useful someway
Explainable by Design Trustworthy Convolutional Neural Networks: A Gradient Penalized-based Approach Arxiv
Colorwise Saliency Visualizing Color-wise Saliency of Black-Box Image Classification Models Arxiv
concept based Concept Discovery for The Interpretation of Landscape Scenicness Downloadable File
Integrated Score CAM IS-CAM: Integrated Score-CAM for axiomatic-based explanations Arxiv
Grad LAM Grad-LAM: Visualization of Deep Neural Networks for Unsupervised Learning EURASIP 2020
Cites TCAV Integrating Intrinsic and Extrinsic Explainability: The Relevance of Understanding Neural Networks for Human-Robot Interaction AAAI 2020
Attribution Learning Propagation Rules for Attribution Map Generation Arxiv
Zoom CAM Zoom-CAM: Generating Fine-grained Pixel Annotations from Image Labels Arxiv must read before modularity proposal
Masking based saliency maps investigation INVESTIGATING AND SIMPLIFYING MASKING-BASED SALIENCY MAP METHODS FOR MODEL INTERPRETABILITY Arxiv PyTorch
Evaluation Evaluating Attribution Methods using White-Box LSTMs EMNLP Workshop PyTorch cites TCAV, says all explanations fail their test
Interpretable Bayesian Neural Networks Incorporating Interpretable Output Constraints in Bayesian Neural Networks NeurIPS 2020 PyTorch
Survey - Counterfactual explanations Counterfactual Explanations for Machine Learning: A Review Arxiv
Standardised Explainability The Need for Standardised Explainability ICML 2020 Workshop
CME Now You See Me (CME): Concept-based Model Extraction CIKM 2020 workshop sklearn
Q FIT Q-FIT: The Quantifiable Feature Importance Technique for Explainable Machine Learning Arxiv
Outside black box Learning outside the Black-Box: The pursuit of interpretable models NeurIPS 2020 sklearn
Discrete Mask Interpreting Image Classifiers by Generating Discrete Masks IEEE - PAMI
Contrastive explanations Learning Global Transparent Models Consistent with Local Contrastive Explanations NeurIPS 2020
Empirical study of Ideal Explanations How Can I Explain This to You? An Empirical Study of Deep Neural Network Explanation Methods NeurIPS 2020 tensorflow 1.15 Example based matching library
This Looks Like That + Relevance This Looks Like That, Because ... Explaining Prototypes for Interpretable Image Recognition Arxiv PyTorch must read before relevance
Concept based posthoc ProtoViewer: Visual Interpretation and Diagnostics of Deep Neural Networks with Factorized Prototypes Paper refer human subject experiments
Shapley Flow Shapley Flow: A Graph-based Approach to Interpreting Model Predictions Arxiv
Attention Vs Saliency and Beyond The elephant in the interpretability room: Why use attention as explanation when we have saliency methods? Arxiv
Unification of removal methods Feature Removal Is A Unifying Principle For Model Explanation Methods NeurIPS 2020 workshop PyTorch from the authors of SHAP; extended Arxiv version
Robust and Stable Black Box Explanations Robust and Stable Black Box Explanations ICML 2020 hima lakkaraju
Debugging test Debugging Tests for Model Explanations Arxiv
AISTATS 2020 submission Ensuring Actionable Recourse via Adversarial Training Arxiv hima lakkaraju
Layer wise explanation Investigating Learning in Deep Neural Networks using Layer-Wise Weight Change ResearchGate
cites TCAV Debiasing Convolutional Neural Networks via Meta Orthogonalization Arxiv Code page not found
Introducing concepts SeXAI: Introducing Concepts into Black Boxes for Explainable Artificial Intelligence Paper Tensorflow 1.4
Additive explainers Learning simplified functions to understand Paper
BIN Born Identity Network: Multi-way Counterfactual Map Generation to Explain a Classifier’s Decision Arxiv Tensorflow 2.2 counterfactual explanations
Explantion using Generative models Explaining image classifiers by removing input features using generative models ACCV 2020 Tensorflow 1.12 & Pytorch 1.1 Nguyen's paper
Action Recognition Explanation Play Fair: Frame Attributions in Video Models ACCV 2020 PyTorch
Concepts in VQA Interpretable Visual Reasoning via Induced Symbolic Space Arxiv Code not yet updated, just repository created
Recourses Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses NeurIPS 2020 hima lakkaraju
Feature Importance of CNN Measuring Feature Importance of Convolutional Neural Networks IEEE
Causal Inference Causal inference using deep neural networks Arxiv Keras
Match up Match Them Up: Visually Explainable Few-shot Image Classification Arxiv PyTorch
Right for the Right Concept Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations Arxiv
MALC Transparency Promotion with Model-Agnostic Linear Competitors ICML 2020
Shapley Taylor Index The Shapley Taylor Interaction Index ICML 2020
Concept based explanation + user feedback Teaching the Machine to Explain Itself using Domain Knowledge Openreview
Counterfactual produces Adversarial Semantics and explanation: why counterfactual explanations produce adversarial examples in deep neural networks AIJ submission
MEME MEME: Generating RNN Model Explanations via Model Extraction OpenReview Keras RNN-specific LIME; see if any improvements for MACE come from here
ProtoPShare ProtoPShare: Prototype Sharing for Interpretable Image Classification and Similarity Discovery Arxiv - Accepted at ACM SIGKDD 2021 PyTorch Improved ProtoPNet (This looks like that)
RANCC RANCC: Rationalizing Neural Networks via Concept Clustering ACL Tensorflow 1.x
EAN Efficient Attention Network: Accelerate Attention by Searching Where to Plug Arxiv PyTorch
LIME Analysis Why model why? Assessing the strengths and limitations of LIME Arxiv sklearn
Rethink positive aggregation Rethinking Positive Aggregation and Propagation of Gradients in Gradient-based Saliency Methods ICML 2020 workshop WHI
Pixel wise interpretation metric A Metric to Compare Pixel-wise Interpretation Methods for Neural Networks IEEE
Latent space debiasing Fair Attribute Classification through Latent Space De-biasing Arxiv PyTorch
Explanation - Teacher Student Evaluating Explanations: How much do explanations from the teacher aid students? Arxiv
Neural Prototype Trees Neural Prototype Trees for Interpretable Fine-grained Image Recognition Arxiv PyTorch same group of This looks like that + relevance
FixOut FixOut: an ensemble approach to fairer models Paper
Concepts on Tabular data Learning Interpretable Concept-Based Models with Human Feedback Arxiv
BayLIME BayLIME: Bayesian Local Interpretable Model-Agnostic Explanations Arxiv Keras
PPI Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models Arxiv Anonymous PyTorch code link given
Generalized distillation Understanding Interpretability by generalized distillation in Supervised Classification AAAI 2021 submission Code will be public upon acceptance
RIG A Singular Value Perspective on Model Robustness Arxiv
Activation analysis Explaining Predictions of Deep Neural Classifier via Activation Analysis Arxiv
Evaluation metrics Evaluating Explainable Methods for Predictive Process Analytics: A Functionally-Grounded Approach Arxiv sklearn
Explanations based on train set Explainable Artificial Intelligence: How Subsets of the Training Data Affect a Prediction Arxiv
DAX DAX: Deep Argumentative eXplanation for Neural Networks Arxiv
Debiased CAM Debiased-CAM for bias-agnostic faithful visual explanations of deep convolutional networks Arxiv Tensorflow 2.1.0 lot of human subject experiments found
Bias via explanation Investigating Bias in Image Classification using Model Explanations ICML WHI 2020
Shapley Credit Allocation On Shapley Credit Allocation for Interpretability Arxiv
Dependency Decomposition Dependency Decomposition and a Reject Option for Explainable Models Arxiv
Interpretation Network xRAI: Explainable Representations through AI Arxiv
Explainable by Design Evolutionary Generative Contribution Mappings IEEE explainable by design
Transformer Explanation Transformer Interpretability Beyond Attention Visualization Arxiv CVPR format PyTorch
MANE MANE: Model-Agnostic Non-linear Explanations for Deep Learning Model IEEE see how similar to MAIRE
Why and Why Not Explanations On Relating ‘Why?’ and ‘Why Not?’ Explanations Arxiv sklearn gives theoretical relationship between feature importance and counterfactual techniques
cites ACE Analyzing Representations inside Convolutional Neural Networks Arxiv PyTorch
CEN CEN: Concept Evolution Network for Image Classification Tasks ACM RICAI 2020 explainable by design
Quantitative evaluation metrics Quantitative Evaluations on Saliency Methods: An Experimental Study Arxiv
Integrating black box and Interpretable model IB-M: A Flexible Framework to Align an Interpretable Model and a Black-box Model IEEE - BIBM 2020
X-GradCAM Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs BMVC 2020 PyTorch
RCAV Robust Semantic Interpretability: Revisiting Concept Activation Vectors ICML WHI 2020 PyTorch
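
Among the 2020 entries, Concept Bottleneck Models are a simple explainable-by-design pattern: predict human-interpretable concepts first, then compute the label only from those concepts, so a human can intervene on a concept and watch the prediction change. A minimal sketch with illustrative, CUB-style dimensions (112 concepts, 200 classes):

```python
import torch
import torch.nn as nn

class ConceptBottleneck(nn.Module):
    def __init__(self, in_dim=2048, n_concepts=112, n_classes=200):
        super().__init__()
        self.to_concepts = nn.Linear(in_dim, n_concepts)  # backbone features -> concepts
        self.to_label = nn.Linear(n_concepts, n_classes)  # concepts -> label

    def forward(self, feats):
        concepts = torch.sigmoid(self.to_concepts(feats))
        return concepts, self.to_label(concepts)

model = ConceptBottleneck()
concepts, logits = model(torch.randn(4, 2048))  # stand-in for backbone features
# Training supervises both heads: BCE on concepts, cross-entropy on labels; at test
# time a human can overwrite a concept value and see the label prediction change.
```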

2021 Papers

Title | Paper Title | Source Link | Code | Tags
Debiasing concepts Debiasing Concept Bottleneck Models with Instrumental Variables ICLR 2021 submissions page - Accepted as Poster causality
Prototype Trajectory Interpretable Sequence Classification Via Prototype Trajectory ICLR 2021 submissions page this looks like that styled RNN
Shapley dependence assumption Shapley explainability on the data manifold ICLR 2021 submissions page
High dimension Shapley Human-interpretable model explainability on high-dimensional data ICLR 2021 submissions page
L2x like paper A Learning Theoretic Perspective on Local Explainability ICLR 2021 submissions page, Arxiv
Evaluation Evaluation of Similarity-based Explanations ICLR 2021 submissions page like the Adebayo paper, for "this looks like that" styled methods
Model correction Defuse: Debugging Classifiers Through Distilling Unrestricted Adversarial Examples ICLR 2021 submissions page
Subspace explanation Constraint-Driven Explanations of Black-Box ML Models ICLR 2021 submissions page to see how close to MUSE by Hima Lakkaraju 2019
Catastrophic forgetting Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting ICLR 2021 submissions page Code available in their Supplementary zip file
Non trivial counterfactual explanations Beyond Trivial Counterfactual Generations with Diverse Valuable Explanations ICLR 2021 submissions page
Explainable by Design Interpretability Through Invertibility: A Deep Convolutional Network With Ideal Counterfactuals And Isosurfaces ICLR 2021 submissions page
Gradient attribution Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability ICLR 2021 submissions page looks like extension of Sixt et al paper
Mask based Explainable by Design Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability ICLR 2021 submissions page
NBDT - Explainable by Design NBDT: Neural-Backed Decision Trees ICLR 2021 submissions page
Variational Saliency Maps Variational saliency maps for explaining model's behavior ICLR 2021 submissions page
Network dissection with coherency or stability metric Importance and Coherence: Methods for Evaluating Modularity in Neural Networks ICLR 2021 submissions page
Modularity Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks ICLR 2021 submissions page Code made anonymous for review, link given in paper
Explainable by design A self-explanatory method for the black problem on discrimination part of CNN ICLR 2021 submissions page seems concepts of game theory applied
Attention not Explanation Why is Attention Not So Interpretable? ICLR 2021 submissions page
Ablation Saliency Ablation Path Saliency ICLR 2021 submissions page
Explainable Outlier Detection Explainable Deep One-Class Classification ICLR 2021 submissions page
XAI without approximation Explainable AI Without Interpretable Model Arxiv
GANMEX GANMEX: ONE-VS-ONE ATTRIBUTIONS USING GAN-BASED MODEL EXPLAINABILITY Arxiv
Evaluating Local Explanations Evaluating local explanation methods on ground truth Artificial Intelligence Journal Elsevier sklearn
Structured Attention Graphs Structured Attention Graphs for Understanding Deep Image Classifications AAAI 2021 PyTorch see how close to MACE
Ground truth explanations Data Representing Ground-Truth Explanations to Evaluate XAI Methods AAAI 2021 sklearn trained models available in their github repository
AGF Visualization of Supervised and Self-Supervised Neural Networks via Attribution Guided Factorization AAAI 2021 PyTorch
RSP Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations AAAI 2021
HyDRA HYDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks AAAI 2021 PyTorch
SWAG SWAG: Superpixels Weighted by Average Gradients for Explanations of CNNs WACV 2021
FastIF FASTIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging Arxiv PyTorch
EVET EVET: Enhancing Visual Explanations of Deep Neural Networks Using Image Transformations WACV 2021
Local Attribution Baselines On Baselines for Local Feature Attributions AAAI 2021 PyTorch
Differentiated Explanations Differentiated Explanation of Deep Neural Networks with Skewed Distributions IEEE - TPAMI journal PyTorch
Human game based survey Explainable AI and Adoption of Algorithmic Advisors: an Experimental Study Arxiv
Explainable by design Learning Semantically Meaningful Features for Interpretable Classifications Arxiv
Expred Explain and Predict, and then Predict again ACM WSDM 2021 PyTorch
Progressive Interpretation An Information-theoretic Progressive Framework for Interpretation Arxiv PyTorch
UCAM Uncertainty Class Activation Map (U-CAM) using Gradient Certainty method IEEE - TIP Project Page PyTorch
progressive GAN explainability - smiling dataset - ICLR 2020 group Explaining the Black-box Smoothly - A Counterfactual Approach Arxiv
Head pasted in another image - experimented WHAT DO DEEP NETS LEARN? CLASS-WISE PATTERNS REVEALED IN THE INPUT SPACE Arxiv
Model correction ExplOrs Explanation Oracles and the architecture of explainability Paper
Explanations - Knowledge Representation A Basic Framework for Explanations in Argumentation IEEE
Eigen CAM Eigen-CAM: Visual Explanations for Deep Convolutional Neural Networks Springer
Evaluation of Posthoc How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations ACM
GLocalX GLocalX - From Local to Global Explanations of Black Box AI Models Arxiv
Consistent Interpretations Explainable Models with Consistent Interpretations AAAI 2021
SIDU Introducing and assessing the explainable AI (XAI) method: SIDU Arxiv
cites This looks like that Explaining black-box classifiers using post-hoc explanations-by-example: The effect of explanations and error-rates in XAI user studies AIJ
i-Algebra i-Algebra: Towards Interactive Interpretability of Deep Neural Networks AAAI 2021
Shape texture bias SHAPE OR TEXTURE: UNDERSTANDING DISCRIMINATIVE FEATURES IN CNNS ICLR 2021
Class agnostic features THE MIND’S EYE: VISUALIZING CLASS-AGNOSTIC FEATURES OF CNNS Arxiv
IBEX A Multi-layered Approach for Tailored Black-box Explanations Paper Code
Relevant explanations Learning Relevant Explanations Paper
Guided Zoom Guided Zoom: Zooming into Network Evidence to Refine Fine-grained Model Decisions IEEE
XAI survey A Survey on Understanding, Visualizations, and Explanation of Deep Neural Networks Arxiv
Pattern theory Convolutional Neural Network Interpretability with General Pattern Theory Arxiv PyTorch
Gaussian Process based explanations Bandits for Learning to Explain from Explanations AAAI 2021 sklearn
LIFT CAM LIFT-CAM: Towards Better Explanations for Class Activation Mapping Arxiv
ObAIEx Right for the Right Reasons: Making Image Classification Intuitively Explainable Paper tensorflow
VAE based explainer Combining an Autoencoder and a Variational Autoencoder for Explaining the Machine Learning Model Predictions IEEE
Segmentation based explanation Deep Co-Attention Network for Multi-View Subspace Learning Arxiv PyTorch
Integrated CAM INTEGRATED GRAD-CAM: SENSITIVITY-AWARE VISUAL EXPLANATION OF DEEP CONVOLUTIONAL NETWORKS VIA INTEGRATED GRADIENT-BASED SCORING ICASSP 2021 PyTorch
Human study VitrAI - Applying Explainable AI in the Real World Arxiv
Attribution Mask Attribution Mask: Filtering Out Irrelevant Features By Recursively Focusing Attention on Inputs of DNNs Arxiv PyTorch
LIME faithfulness What does LIME really see in images? Arxiv Tensorflow 1.x
Assess model reliability Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs Arxiv
Perturbation + Gradient unification Towards the Unification and Robustness of Perturbation and Gradient Based Explanations Arxiv hima lakkaraju
Gradients faithful? Do Input Gradients Highlight Discriminative Features? Arxiv PyTorch
Untrustworthy predictions Identifying Untrustworthy Predictions in Neural Networks by Geometric Gradient Analysis Arxiv
Explaining misclassification Explaining Inaccurate Predictions of Models through k-Nearest Neighbors Paper cites Oscar Li AAAI 2018 prototypes paper
Explanations inside predictions Have We Learned to Explain?: How Interpretability Methods Can Learn to Encode Predictions in their Interpretations AISTATS 2021
Layerwise interpretation LAYER-WISE INTERPRETATION OF DEEP NEURAL NETWORKS USING IDENTITY INITIALIZATION Arxiv
Visualizing Rule Sets Visualizing Rule Sets: Exploration and Validation of a Design Space Arxiv PyTorch
Human experiments Are Explanations Helpful? A Comparative Study of the Effects of Explanations in AI-Assisted Decision-Making IUI 2021
Attention fine-grained classification Interpretable Attention Guided Network for Fine-grained Visual Classification Arxiv
Concept construction Explaining Classifiers by Constructing Familiar Concepts Paper PyTorch
EbD Human-Understandable Decision Making for Visual Recognition Arxiv
Bridging XAI algorithm , Human needs Towards Connecting Use Cases and Methods in Interpretable Machine Learning Arxiv
Generative trustworthy classifiers Generative Classifiers as a Basis for Trustworthy Image Classification Paper Github
Counterfactual explanations Generating Interpretable Counterfactual Explanations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties AISTATS 2021 PyTorch
Role categorization of CNN units Quantitative Effectiveness Assessment and Role Categorization of Individual Units in Convolutional Neural Networks ICML 2021
Non-trivial counterfactual explanations Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations Arxiv
NP-ProtoPNet These do not Look Like Those: An Interpretable Deep Learning Model for Image Recognition IEEE
Correcting neural networks based on explanations Refining Neural Networks with Compositional Explanations Arxiv Code link given in paper, but page not found
Contrastive reasoning Contrastive Reasoning in Neural Networks Arxiv
Concept based Intersection Regularization for Extracting Semantic Attributes Arxiv
Boundary explanations Boundary Attributions Provide Normal (Vector) Explanations Arxiv PyTorch
Generative Counterfactuals ECINN: Efficient Counterfactuals from Invertible Neural Networks Arxiv
ICE Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors AAAI 2021
Group CAM Group-CAM: Group Score-Weighted Visual Explanations for Deep Convolutional Networks Arxiv PyTorch
HMM interpretability Towards interpretability of Mixtures of Hidden Markov Models AAAI 2021 sklearn
Empirical Explainers Efficient Explanations from Empirical Explainers Arxiv PyTorch
FixNorm FIXNORM: DISSECTING WEIGHT DECAY FOR TRAINING DEEP NEURAL NETWORKS Arxiv
CoDA-Net Convolutional Dynamic Alignment Networks for Interpretable Classifications CVPR 2021 Code link given in paper. Repository not yet created
Like Dr. Chandru sir's (IITPKD) XAI work Neural Response Interpretation through the Lens of Critical Pathways Arxiv PyTorch - Pathway Grad; PyTorch - ROAR
Inaugment InAugment: Improving Classifiers via Internal Augmentation Arxiv Code yet to be updated
Gradual Grad CAM Enhancing Deep Neural Network Saliency Visualizations with Gradual Extrapolation Arxiv PyTorch
A-FMI A-FMI: LEARNING ATTRIBUTIONS FROM DEEP NETWORKS VIA FEATURE MAP IMPORTANCE Arxiv
Trust - Regression To Trust or Not to Trust a Regressor: Estimating and Explaining Trustworthiness of Regression Predictions AAAI 2021 sklearn
Concept based explanations - study IS DISENTANGLEMENT ALL YOU NEED? COMPARING CONCEPT-BASED & DISENTANGLEMENT APPROACHES ICLR 2021 workshop tensorflow 2.3
Faithful attribution Mutual Information Preserving Back-propagation: Learn to Invert for Faithful Attribution Arxiv
Counterfactual explanation Counterfactual attribute-based visual explanations for classification Springer
User based explanations “That's (not) the output I expected!” On the role of end user expectations in creating explanations of AI systems AIJ
Human understandable concept based explanations Towards Human-Understandable Visual Explanations: Imperceptible High-frequency Cues Can Better Be Removed Arxiv
Improved attribution Improving Attribution Methods by Learning Submodular Functions Arxiv
SHAP tractability On the Complexity of SHAP-Score-Based Explanations: Tractability via Knowledge Compilation and Non-Approximability Results Arxiv
SHAP explanation network SHAPLEY EXPLANATION NETWORKS ICLR 2021 PyTorch
Concept based dataset shift explanation FAILING CONCEPTUALLY: CONCEPT-BASED EXPLANATIONS OF DATASET SHIFT ICLR 2021 workshop tensorflow 2
Evaluating CAM Revisiting The Evaluation of Class Activation Mapping for Explainability: A Novel Metric and Experimental Analysis Arxiv
EFC-CAM Exclusive Feature Constrained Class Activation Mapping for Better Visual Explanation IEEE
Causal Interpretation Instance-wise Causal Feature Selection for Model Interpretation Arxiv PyTorch
Fairness in Learning Learning to Learn to be Right for the Right Reasons Arxiv
Feature attribution correctness Do Feature Attribution Methods Correctly Attribute Features? Arxiv Code not yet updated
NICE NICE: AN ALGORITHM FOR NEAREST INSTANCE COUNTERFACTUAL EXPLANATIONS Arxiv Own Python Package
SCG A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts Arxiv
This looks like that - drawback This Looks Like That... Does it? Shortcomings of Latent Space Prototype Interpretability in Deep Networks Arxiv PyTorch
Exemplar based classification Visualizing Association in Exemplar-Based Classification ICASSP 2021
Correcting classification CORRECTING CLASSIFICATION: A BAYESIAN FRAMEWORK USING EXPLANATION FEEDBACK TO IMPROVE CLASSIFICATION ABILITIES Arxiv
Concept Bottleneck Networks DO CONCEPT BOTTLENECK MODELS LEARN AS INTENDED? ICLR workshop 2021
Sanity for saliency Sanity Simulations for Saliency Methods Arxiv
Concept based explanations Cause and Effect: Concept-based Explanation of Neural Networks Arxiv
CLIMEP How to Explain Neural Networks: A perspective of data space division Arxiv
Sufficient explanations Probabilistic Sufficient Explanations Arxiv Empty Repository
SHAP baseline Learning Baseline Values for Shapley Values Arxiv
Explainable by Design EXoN: EXplainable encoder Network Arxiv tensorflow 2.4.0 explainable VAE
Concept based explanations Aligning Artificial Neural Networks and Ontologies towards Explainable AI AAAI 2021
XAI via Bayesian teaching ABSTRACTION, VALIDATION, AND GENERALIZATION FOR EXPLAINABLE ARTIFICIAL INTELLIGENCE Arxiv
Explanation blind spots DO NOT EXPLAIN WITHOUT CONTEXT: ADDRESSING THE BLIND SPOT OF MODEL EXPLANATIONS Arxiv
BLA Bounded logit attention: Learning to explain image classifiers Arxiv tensorflow L2X++
Interpretability - mathematical model The Definitions of Interpretability and Learning of Interpretable Models Arxiv
Similar to our ICML workshop 2021 work The effectiveness of feature attribution methods and its correlation with automatic evaluation scores Arxiv
EDDA EDDA: Explanation-driven Data Augmentation to Improve Model and Explanation Alignment Arxiv
Relevant set explanations Efficient Explanations With Relevant Sets Arxiv
Model transfer Making CNNs Interpretable by Building Dynamic Sequential Decision Forests with Top-down Hierarchy Learning Arxiv
Model correction Finding and Fixing Spurious Patterns with Explanations Arxiv
Neuron graph communities On the Evolution of Neuron Communities in a Deep Learning Architecture Arxiv
Mid level features explanations A general approach for Explanations in terms of Middle Level Features Arxiv see how different from MUSE by Hima Lakkaraju group
Concept based knowledge distillation Towards Black-Box Explainability with Gaussian Discriminant Knowledge Distillation CVPR 2021 workshop compare and contrast with network dissection
CNN high frequency bias Dissecting the High-Frequency Bias in Convolutional Neural Networks CVPR 2021 workshop Tensorflow
Explainable by design Entropy-based Logic Explanations of Neural Networks Arxiv PyTorch concept based
CALM Keep CALM and Improve Visual Feature Attribution Arxiv PyTorch
Relevance CAM Relevance-CAM: Your Model Already Knows Where to Look CVPR 2021 PyTorch
S-LIME S-LIME: Stabilized-LIME for Model Explanation Arxiv sklearn
Local + Global Best of both worlds: local and global explanations with human-understandable concepts Arxiv Been Kim's group
Guided integrated gradients Guided Integrated Gradients: an Adaptive Path Method for Removing Noise CVPR 2021 (see the sketch after this table)
Concept based Meaningfully Explaining a Model’s Mistakes Arxiv
Explainable by design It’s FLAN time! Summing feature-wise latent representations for interpretability Arxiv
SimAM SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks ICML 2021 PyTorch
DANCE DANCE: Enhancing saliency maps using decoys ICML 2021 Tensorflow 1.x
EbD Concept formation Explore Visual Concept Formation for Image Classification ICML 2021 PyTorch
Explainable by design Interpretable Compositional Convolutional Neural Networks Arxiv
Attribution aggregation Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation AAAI 2021 - pdf
Perturbation based activation A Novel Visual Interpretability for Deep Neural Networks by Optimizing Activation Maps with Perturbation AAAI 2021
Global explanations Feature Synergy, Redundancy, and Independence in Global Model Explanations using SHAP Vector Decomposition Arxiv Github package
L2E Learning to Explain: Generating Stable Explanations Fast ACL 2021 PyTorch NLE
Joint Shapley Joint Shapley values: a measure of joint feature importance Arxiv
Explainable by design Align Yourself: Self-supervised Pre-training for Fine-grained Recognition via Saliency Alignment Arxiv
Explainable by design SONG: SELF-ORGANIZING NEURAL GRAPHS Arxiv
Explainable by design Designing Shapelets for Interpretable Data-Agnostic Classification AIES 2021 sklearn Interpretable block of time series extended to other data modalities like image, text, tabular
Global explanations + Model correction Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability Arxiv PyTorch
HIL- Model correction Human-in-the-loop Extraction of Interpretable Concepts in Deep Learning Models Arxiv
Activation based Cause Analysis Activation-Based Cause Analysis Method for Neural Networks IEEE Access 2021
Local explanations Leveraging Latent Features for Local Explanations ACM SIGKDD 2021 Amit Dhurandhar group
Fairness Adequate and fair explanations Arxiv - Accepted in CD-MAKE 2021
Global explanations Finding Representative Interpretations on Convolutional Neural Networks ICCV 2021
Groupwise explanations Learning Groupwise Explanations for Black-Box Models IJCAI 2021 PyTorch
Mathematical On Smoother Attributions using Neural Stochastic Differential Equations IJCAI 2021
AGI Explaining Deep Neural Network Models with Adversarial Gradient Integration IJCAI 2021 PyTorch
Accountable attribution Longitudinal Distance: Towards Accountable Instance Attribution Arxiv Tensorflow Keras
Global explanation Understanding of Kernels in CNN Models by Suppressing Irrelevant Visual Features in Images Arxiv
Concepts based - Explainable by design Inducing Semantic Grouping of Latent Concepts for Explanations: An Ante-Hoc Approach Arxiv IITH Vineeth sir group
Explainable by design This looks more like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation Arxiv
MIL ProtoMIL: Multiple Instance Learning with Prototypical Parts for Fine-Grained Interpretability Arxiv
Concept based explanations Instance-wise or Class-wise? A Tale of Neighbor Shapley for Concept-based Explanation Arxiv
Counterfactual explanation + Theory of Mind CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models Arxiv
Evaluation metric Counterfactual Evaluation for Explainable AI Arxiv
CIM - FSC CIM: Class-Irrelevant Mapping for Few-Shot Classification Arxiv
Causal Concepts Unsupervised Causal Binary Concepts Discovery with VAE for Black-box Model Explanation Arxiv
ECE Ensemble of Counterfactual Explainers Paper Code - seems hybrid of tf and torch
Structured Explanations From Heatmaps to Structured Explanations of Image Classifiers Arxiv
XAI metric An Objective Metric for Explainable AI - How and Why to Estimate the Degree of Explainability Arxiv
DisCERN DisCERN: Discovering Counterfactual Explanations using Relevance Features from Neighbourhoods Arxiv
PSEM Towards Better Model Understanding with Path-Sufficient Explanations Arxiv Amit Dhurandhar sir group
Evaluation traps The Logic Traps in Evaluating Post-hoc Interpretations Arxiv
Interactive explanations Explainability Requires Interactivity Arxiv PyTorch
CounterNet CounterNet: End-to-End Training of Counterfactual Aware Predictions Arxiv PyTorch
Evaluation metric - Concept based explanation Detection Accuracy for Evaluating Compositional Explanations of Units Arxiv
Explanation - Uncertainity Effects of Uncertainty on the Quality of Feature Importance Explanations Arxiv
Survey Paper TOWARDS USER-CENTRIC EXPLANATIONS FOR EXPLAINABLE MODELS: A REVIEW JISTM Journal Paper
Feature attribution The Struggles and Subjectivity of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets AAAI 2021 workshop
Contextual explanation Context-based image explanations for deep neural networks Image and Vision Computing Journal
Causal + Counterfactual Counterfactual Instances Explain Little Arxiv
Case based Posthoc Explaining Deep Learning using examples: Optimal feature weighting methods for twin systems using post-hoc, explanation-by-example in XAI Elsevier
Debugging gray box model Toward a Unified Framework for Debugging Gray-box Models Arxiv
Explainable by design Optimising for Interpretability: Convolutional Dynamic Alignment Networks Arxiv
XAI negative effect Explainability Pitfalls: Beyond Dark Patterns in Explainable AI Arxiv
Evaluate attributions WHO EXPLAINS THE EXPLANATION? QUANTITATIVELY ASSESSING FEATURE ATTRIBUTION METHODS Arxiv
Counterfactual explanations Designing Counterfactual Generators using Deep Model Inversion Arxiv
Model correction using explanation Consistent Explanations by Contrastive Learning Arxiv
Visualize feature maps Visualizing Feature Maps for Model Selection in Convolutional Neural Networks ICCV 2021 Workshop Tensorflow 1.15
SPS Stochastic Partial Swap: Enhanced Model Generalization and Interpretability for Fine-grained Recognition ICCV 2021 PyTorch
DMBP Generating Attribution Maps with Disentangled Masked Backpropagation ICCV 2021
Better CAM Towards Better Explanations of Class Activation Mapping ICCV 2021
LEG Statistically Consistent Saliency Estimation ICCV 2021 Keras
IBA Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information NeurIPS 2021 PyTorch
Looks similar to This Looks Like That Interpretable Image Recognition by Constructing Transparent Embedding Space ICCV 2021 Code not yet publicly released
Causal Imagenet CAUSAL IMAGENET: HOW TO DISCOVER SPURIOUS FEATURES IN DEEP LEARNING? Arxiv
Model correction Logic Constraints to Feature Importances Arxiv
Receptive field Misalignment CAM On the Receptive Field Misalignment in CAM-based Visual Explanations Pattern recognition Letters PyTorch
Simplex Explaining Latent Representations with a Corpus of Examples Arxiv PyTorch
Sanity checks Revisiting Sanity Checks for Saliency Maps Arxiv - NeurIPS 2021 workshop
Model correction Debugging the Internals of Convolutional Networks PDF
SITE Self-Interpretable Model with Transformation Equivariant Interpretation Arxiv Accepted at NeurIPS 2021 EbD
Influential examples Revisiting Methods for Finding Influential Examples Arxiv
SOBOL Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis NeurIPS 2021 Tensorflow and PyTorch
Feature vectors Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics Arxiv global interpretability
OOD in explainability The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations NeurIPS 2021 sklearn
RPS LJE Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models NeurIPS 2021 PyTorch
Model correction Editing a Classifier by Rewriting Its Prediction Rules NeurIPS 2021 Code
suppressor variable litmus test Scrutinizing XAI using linear ground-truth data with suppressor variables Arxiv
Explainable knowledge distillation Learning Interpretation with Explainable Knowledge Distillation Arxiv
STEEX STEEX: Steering Counterfactual Explanations with Semantics Arxiv Code
Binary counterfactual explanation Counterfactual Explanations via Latent Space Projection and Interpolation Arxiv
ECLAIRE Efficient Decompositional Rule Extraction for Deep Neural Networks Arxiv R
CartoonX Cartoon Explanations of Image Classifiers Researchgate
concept based explanation Explanations in terms of Hierarchically organised Middle Level Features Paper see how close to MACE and PACE
Concept ball Ontology-based 𝑛-ball Concept Embeddings Informing Few-shot Image Classification Paper
SPARROW SPARROW: Semantically Coherent Prototypes for Image Classification BMVC 2021
XAI evaluation criteria Objective criteria for explanations of machine learning models Paper
Code inversion with human perception EXPLORING ALIGNMENT OF REPRESENTATIONS WITH HUMAN PERCEPTION Arxiv
Deformable ProtoPNet Deformable ProtoPNet: An Interpretable Image Classifier Using Deformable Prototypes Arxiv
ICSN Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations Arxiv
HIVE HIVE: Evaluating the Human Interpretability of Visual Explanations Arxiv Project Page
Jitter CAM Jitter-CAM: Improving the Spatial Resolution of CAM-Based Explanations BMVC 2021 PyTorch
Interpreting last layer Identifying Class Specific Filters with L1 Norm Frequency Histograms in Deep CNNs Arxiv
FCP Forward Composition Propagation for Explainable Neural Reasoning Arxiv
ProtoPool Interpretable Image Classification with Differentiable Prototypes Assignment Arxiv
PRELIM Pedagogical Rule Extraction for Learning Interpretable Models Arxiv
Fair correction vectors FAIR INTERPRETABLE LEARNING VIA CORRECTION VECTORS ICLR 2021
Smooth LRP SmoothLRP: Smoothing LRP by Averaging over Stochastic Input Variations ESANN 2021
Causal CAM EXTRACTING CAUSAL VISUAL FEATURES FOR LIMITED LABEL CLASSIFICATION ICIP 2021
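
Several 2021 entries (Guided Integrated Gradients, Integrated Grad-CAM, the baseline studies) build on plain integrated gradients: average the input gradient along a straight path from a baseline to the input, then scale by the input-baseline difference. A minimal sketch with a zero baseline and 32 path steps, both assumptions that the cited papers question:

```python
import torch
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()
x = torch.randn(1, 3, 224, 224)              # stand-in for a preprocessed image
baseline = torch.zeros_like(x)               # the much-debated black-image baseline

with torch.no_grad():
    target = model(x).argmax().item()        # fix the class to attribute

steps, grads = 32, []
for alpha in torch.linspace(0.0, 1.0, steps):
    xi = (baseline + alpha * (x - baseline)).requires_grad_(True)
    model(xi)[0, target].backward()          # gradient at each point on the path
    grads.append(xi.grad)

attr = (x - baseline) * torch.stack(grads).mean(dim=0)   # (1, 3, 224, 224) attribution
```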

2022 Papers

Title | Paper Title | Source Link | Code | Tags
SNI Semantic Network Interpretation WACV 2022
F-CAM F-CAM: Full Resolution Class Activation Maps via Guided Parametric Upscaling WACV 2022 PyTorch
PCACE PCACE: A Statistical Approach to Ranking Neurons for CNN Interpretability Arxiv
Evaluating Attribution methods Evaluating Attribution Methods in Machine Learning Interpretability IEEE International Conference on Big Data
X-decision making Explainable Decision Making with Lean and Argumentative Explanations Arxiv
Including domain knowledge in neural networks A review of some techniques for inclusion of domain-knowledge into deep neural networks Nature
CNN Hierarchical Decomposition Deeply Explain CNN via Hierarchical Decomposition Arxiv
Explanatory learning EXPLANATORY LEARNING: BEYOND EMPIRICISM IN NEURAL NETWORKS Arxiv
Conceptor CAM Conceptor Learning for Class Activation Mapping IEEE-TIP
Classifier orthogonalization CONTROLLING DIRECTIONS ORTHOGONAL TO A CLASSIFIER ICLR 2022 PyTorch
Attention not explanation Attention cannot be an Explanation Arxiv
CNN sensitivity analysis A Comprehensive Study of Image Classification Model Sensitivity to Foregrounds, Backgrounds, and Visual Attributes Arxiv
Trusting extrapolation To what extent should we trust AI models when they extrapolate? Arxiv
LAP LAP: An Attention-Based Module for Faithful Interpretation and Knowledge Injection in Convolutional Neural Networks Arxiv concept based explanations
Saliency map evaluation metrics Metrics for saliency map evaluation of deep learning explanation methods Arxiv (see the sketch after this table)
LINEX Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning Arxiv
ROAD Evaluating Feature Attribution: An Information-Theoretic Perspective Arxiv PyTorch
CBM-AUC Concept Bottleneck Model with Additional Unsupervised Concepts Arxiv
Explainability as dialogue Rethinking Explainability as a Dialogue: A Practitioner’s Perspective Arxiv
IAA Aligning Eyes between Humans and Deep Neural Network through Interactive Attention Alignment Arxiv
Plug in A Novel Plug-in Module for Fine-Grained Visual Classification Arxiv PyTorch
Hierarchical concepts Cause and Effect: Hierarchical Concept-based Explanation of Neural Networks Arxiv
Model correction by design LEARNING ROBUST CONVOLUTIONAL NEURAL NETWORKS WITH RELEVANT FEATURE FOCUSING VIA EXPLANATIONS Arxiv
Concept discovery Discovering Concepts in Learned Representations using Statistical Inference and Interactive Visualization Arxiv
Rare spurious correlation Understanding Rare Spurious Correlations in Neural Networks Arxiv PyTorch
Causal Matching Learned Causal Effects of Neural Networks with Domain Priors Arxiv
PYLON Improved image classification explainability with high accuracy heatmaps iScience Journal
Causal counterfactual REALISTIC COUNTERFACTUAL EXPLANATIONS BY LEARNED RELATIONS Arxiv
Argumentative Causal explanation Forging Argumentative Explanations from Causal Models Paper
EVA Don’t Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis Arxiv
Conceptual modelling ConceptSuperimposition: Using Conceptual Modeling Method for Explainable AI Paper
SIDU Visual Explanation of Black-Box Model : Similarity Difference and Uniqueness (SIDU) Method Pattern Recognition Journal Tensorflow 2.x
Explainable representations Explaining, Evaluating and Enhancing Neural Networks’ Learned Representations Arxiv
XAI Overview Explanatory Paradigms in Neural Networks Arxiv
Evaluating attribution methods Evaluating Feature Attribution Methods in the Image Domain Arxiv PyTorch
Prototype vector + perturbation The Need for Empirical Evaluation of Explanation Quality Arxiv
ADVISE ADVISE: ADaptive Feature Relevance and VISual Explanations for Convolutional Neural Networks Arxiv Matlab
Improving Grad CAM Improving the Interpretability of GradCAMs in Deep Classification Networks Science Direct
Explainable by design Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks CVPR 2022 PyTorch
CAMP Do Explanations Explain? Model Knows Best Arxiv PyTorch
Attribution stability RETHINKING STABILITY FOR ATTRIBUTION-BASED EXPLANATIONS Arxiv
SSCCD Sparse Subspace Clustering for Concept Discovery (SSCCD) Arxiv
Model improvement Beyond Explaining: Opportunities and Challenges of XAI-Based Model Improvement Arxiv
Causal explanations Trying to Outrun Causality in Machine Learning: Limitations of Model Explainability Techniques for Identifying Predictive Variables Arxiv sklearn
Causal explanations Diffusion Causal Models for Counterfactual Estimation Arxiv
Causal inference influence functions A Free Lunch with Influence Functions? Improving Neural Network Estimates with Concepts from Semiparametric Statistics Arxiv PyTorch
Causal discovery Causal discovery for observational sciences using supervised machine learning Arxiv
Causal DA Causal Domain Adaptation with Copula Entropy based Conditional Independence Test Arxiv
Causal experimental design Interventions, Where and How? Experimental Design for Causal Models at Scale Arxiv seems ICML format
Causal discovery SCORE MATCHING ENABLES CAUSAL DISCOVERY OF NONLINEAR ADDITIVE NOISE MODELS Arxiv
Causal Explanation - Cynthia Rudin WHY INTERPRETABLE CAUSAL INFERENCE IS IMPORTANT FOR HIGH-STAKES DECISION MAKING FOR CRITICALLY ILL PATIENTS AND HOW TO DO IT Arxiv
Semantically consistent counterfactuals Making Heads or Tails: Towards Semantically Consistent Visual Counterfactuals Arxiv
Posthoc global hypersphere Post-hoc Global Explanation using Hypersphere Sets ICAART 2022
CapsNet explanation Investigation of Capsule Networks Regarding their Potential of Explainability and Image Rankings ICAART 2022
XAI evaluation A Unified Study of Machine Learning Explanation Evaluation Metrics Arxiv
Concept based counterfactual explanations DISSECT: Disentangled Simultaneous Explanations via Concept Traversals ICLR 2022 tensorflow 1.12 Been Kim's group
concept evolution ConceptEvo: Interpreting Concept Evolution in Deep Learning Training Arxiv
Poly-CAM Backward recursive Class Activation Map refinement for high resolution saliency map Paper
Interactive Concept explanation ConceptExplainer: Interactive Explanation for Deep Neural Networks from a Concept Perspective Arxiv
Quasi ProtoPNet Think positive: An interpretable neural network for image recognition Neural Networks Journal
TAM VISUALIZING DEEP NEURAL NETWORKS WITH TOPOGRAPHIC ACTIVATION MAPS Arxiv
S-XAI Semantic interpretation for convolutional neural networks: What makes a cat a cat? Arxiv
See through DNN Perception Visualization: Seeing Through the Eyes of a DNN Arxiv
IOM Understanding CNNs from excitations Arxiv
KICE Integrating Prior Knowledge in Post-hoc Explanations Arxiv
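
Many 2022 entries above are about evaluating attributions rather than producing them. One common metric family is the deletion curve: remove the highest-attributed pixels first and track the target-class probability; a faster drop suggests a more faithful map (the ROAD entry analyses the information leakage of exactly this kind of masking). A minimal sketch, with a random map standing in for any attribution method:

```python
import torch
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()
x = torch.randn(1, 3, 224, 224)              # stand-in for a preprocessed image
saliency = torch.rand(224, 224)              # stand-in for any attribution map

with torch.no_grad():
    target = model(x).argmax().item()
    order = saliency.flatten().argsort(descending=True)     # most salient pixels first
    curve = []
    for frac in (0.0, 0.1, 0.2, 0.4, 0.8):
        xm = x.clone().reshape(1, 3, -1)
        xm[:, :, order[: int(frac * order.numel())]] = 0.0  # delete top-ranked pixels
        curve.append(model(xm.reshape_as(x)).softmax(dim=1)[0, target].item())
print(curve)  # a steep decay means the map ranks truly important pixels first
```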
