Order in the Court: Explainable AI Methods Prone to Disagreement


Our paper was presented at the ICML Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI.

arxiv.org/abs/2105.03287

Abstract

By computing the rank correlation between attention weights and feature-additive explanation methods, previous analyses either invalidate or support the role of attention-based explanations as a faithful and plausible measure of salience. To investigate whether this approach is appropriate, we compare LIME, Integrated Gradients, DeepLIFT, Grad-SHAP, Deep-SHAP, and attention-based explanations, applied to two neural architectures trained on single- and pair-sequence language tasks. In most cases, we find that none of our chosen methods agree. Based on our empirical observations and theoretical objections, we conclude that rank correlation does not measure the quality of feature-additive methods. Practitioners should instead use the numerous and rigorous diagnostic methods proposed by the community.
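To make the kind of comparison discussed in the abstract concrete, here is a minimal sketch of computing a rank correlation between two sets of per-token importance scores. It assumes Kendall's tau (the tau-a variant) as the rank correlation statistic, which the abstract does not specify; the function name `kendall_tau` and the example scores are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two equal-length lists of importance scores.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two scores to rank")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # Positive product: both methods order tokens i and j the same way.
        sign = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    n_pairs = n * (n - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-token scores for one four-token input (illustrative only).
attention_scores = [0.10, 0.40, 0.20, 0.30]
lime_scores = [0.05, 0.50, 0.30, 0.15]

tau = kendall_tau(attention_scores, lime_scores)
print(f"Kendall tau between the two explanations: {tau:.3f}")
```

A value near 1 means the two methods rank tokens almost identically, and a value near -1 means they rank them in opposite order. As the abstract argues, a low value only shows that the methods disagree; it does not indicate which of them, if either, is faithful.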


Stefan F. Schouten

