Sum-of-Parts: Faithful Attributions for Groups of Features
- URL: http://arxiv.org/abs/2310.16316v2
- Date: Wed, 02 Oct 2024 23:37:28 GMT
- Title: Sum-of-Parts: Faithful Attributions for Groups of Features
- Authors: Weiqiu You, Helen Qu, Marco Gatti, Bhuvnesh Jain, Eric Wong,
- Abstract summary: Sum-of-Parts ( SOP) is a framework that transforms any differentiable model into a self-explaining model whose predictions can be attributed to groups of features.
SOP achieves highest performance while also scoring high with respect to faithfulness metrics on ImageNet and CosmoGrid.
We validate the usefulness of the groups learned by SOP through their high purity, strong human distinction ability, and practical utility in scientific discovery.
- Score: 8.68707471649733
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Feature attributions explain machine learning predictions by assigning importance scores to input features. While faithful attributions accurately reflect feature contributions to the model's prediction, unfaithful ones can lead to misleading interpretations, making them unreliable in high-stake domains. The challenge of unfaithfulness of post-hoc attributions led to the development of self-explaining models. However, self-explaining models often trade-off performance for interpretability. In this work, we develop Sum-of-Parts (SOP), a new framework that transforms any differentiable model into a self-explaining model whose predictions can be attributed to groups of features. The SOP framework leverages pretrained deep learning models with custom attention modules to learn useful feature groups end-to-end without direct supervision. With these capabilities, SOP achieves highest performance while also scoring high with respect to faithfulness metrics on both ImageNet and CosmoGrid. We validate the usefulness of the groups learned by SOP through their high purity, strong human distinction ability, and practical utility in scientific discovery. In a case study, we show how SOP assists cosmologists in uncovering new insights about galaxy formation.
Related papers
- Semi-KAN: KAN Provides an Effective Representation for Semi-Supervised Learning in Medical Image Segmentation [2.717521115234258]
Semi-supervised medical image segmentation (SSMIS) offers a viable alternative to CNNs and ViTs.
Inspired by Kolmogorov-Arnold Networks (KANs), we propose Semi-KAN.
KANs exhibit superior representation learning capabilities with fewer parameters.
We show that Semi-KAN surpasses baseline networks, utilizing fewer KAN layers and lower computational cost.
arXiv Detail & Related papers (2025-03-19T08:27:41Z) - Explainability of Highly Associated Fuzzy Churn Patterns in Binary Classification [21.38368444137596]
This study emphasizes the importance of identifying multivariate patterns and setting soft bounds for intuitive interpretation.
The main objective is to use a machine learning model and fuzzy-set theory with top-textitk HUIM to identify highly associated patterns of customer churn.
As a result, the study introduces an innovative approach that improves the explainability and effectiveness of customer churn prediction models.
arXiv Detail & Related papers (2024-10-21T09:44:37Z) - Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z) - Evaluating and Explaining Large Language Models for Code Using Syntactic
Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - Post Hoc Explanations of Language Models Can Improve Language Models [43.2109029463221]
We present a novel framework, Amplifying Model Performance by Leveraging In-Context Learning with Post Hoc Explanations (AMPLIFY)
We leverage post hoc explanation methods which output attribution scores (explanations) capturing the influence of each of the input features on model predictions.
Our framework, AMPLIFY, leads to prediction accuracy improvements of about 10-25% over a wide range of tasks.
arXiv Detail & Related papers (2023-05-19T04:46:04Z) - NxPlain: Web-based Tool for Discovery of Latent Concepts [16.446370662629555]
We present NxPlain, a web application that provides an explanation of a model's prediction using latent concepts.
NxPlain discovers latent concepts learned in a deep NLP model, provides an interpretation of the knowledge learned in the model, and explains its predictions based on the used concepts.
arXiv Detail & Related papers (2023-03-06T10:45:24Z) - Measuring the Driving Forces of Predictive Performance: Application to
Credit Scoring [0.0]
In credit scoring, machine learning models are known to outperform standard parametric models.
We introduce the XPER methodology to decompose a performance metric into contributions associated with a model.
We show that a small number of features can explain a surprisingly large part of the model performance.
arXiv Detail & Related papers (2022-12-12T13:09:46Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP)
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains under explored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z) - Discover, Explanation, Improvement: An Automatic Slice Detection
Framework for Natural Language Processing [72.14557106085284]
slice detection models (SDM) automatically identify underperforming groups of datapoints.
This paper proposes a benchmark named "Discover, Explain, improve (DEIM)" for classification NLP tasks.
Our evaluation shows that Edisa can accurately select error-prone datapoints with informative semantic features.
arXiv Detail & Related papers (2022-11-08T19:00:00Z) - Equivariant Transduction through Invariant Alignment [71.45263447328374]
We introduce a novel group-equivariant architecture that incorporates a group-in hard alignment mechanism.
We find that our network's structure allows it to develop stronger equivariant properties than existing group-equivariant approaches.
We additionally find that it outperforms previous group-equivariant networks empirically on the SCAN task.
arXiv Detail & Related papers (2022-09-22T11:19:45Z) - Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution
Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z) - Explain, Edit, and Understand: Rethinking User Study Design for
Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z) - Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
arXiv Detail & Related papers (2021-12-15T09:50:25Z) - Partial Order in Chaos: Consensus on Feature Attributions in the
Rashomon Set [50.67431815647126]
Post-hoc global/local feature attribution methods are being progressively employed to understand machine learning models.
We show that partial orders of local/global feature importance arise from this methodology.
We show that every relation among features present in these partial orders also holds in the rankings provided by existing approaches.
arXiv Detail & Related papers (2021-10-26T02:53:14Z) - Interpreting and improving deep-learning models with reality checks [13.287382944078562]
This chapter covers recent work aiming to interpret models by attributing importance to features and feature groups for a single prediction.
We show how these attributions can be used to directly improve the generalization of a neural network or to distill it into a simple model.
arXiv Detail & Related papers (2021-08-16T00:58:15Z) - Subgroup Generalization and Fairness of Graph Neural Networks [12.88476464580968]
We present a novel PAC-Bayesian analysis for GNNs under a non-IID semi-supervised learning setup.
We further study an accuracy-(dis)parity-style (un)fairness of GNNs from a theoretical perspective.
arXiv Detail & Related papers (2021-06-29T16:13:41Z) - Group Equivariant Neural Architecture Search via Group Decomposition and
Reinforcement Learning [17.291131923335918]
We prove a new group-theoretic result in the context of equivariant neural networks.
We also design an algorithm to construct equivariant networks that significantly improves computational complexity.
We use deep Q-learning to search for group equivariant networks that maximize performance.
arXiv Detail & Related papers (2021-04-10T19:37:25Z) - Contextual Classification Using Self-Supervised Auxiliary Models for
Deep Neural Networks [6.585049648605185]
We introduce the notion of Self-Supervised Autogenous Learning (SSAL) models.
A SSAL objective is realized through one or more additional targets that are derived from the original supervised classification task.
We show that SSAL models consistently outperform the state-of-the-art while also providing structured predictions that are more interpretable.
arXiv Detail & Related papers (2021-01-07T18:41:16Z) - Ensembles of Spiking Neural Networks [0.3007949058551534]
This paper demonstrates how to construct ensembles of spiking neural networks producing state-of-the-art results.
We achieve classification accuracies of 98.71%, 100.0%, and 99.09%, on the MNIST, NMNIST and DVS Gesture datasets respectively.
We formalize spiking neural networks as GLM predictors, identifying a suitable representation for their target domain.
arXiv Detail & Related papers (2020-10-15T17:45:18Z) - Interpretable Learning-to-Rank with Generalized Additive Models [78.42800966500374]
Interpretability of learning-to-rank models is a crucial yet relatively under-examined research area.
Recent progress on interpretable ranking models largely focuses on generating post-hoc explanations for existing black-box ranking models.
We lay the groundwork for intrinsically interpretable learning-to-rank by introducing generalized additive models (GAMs) into ranking tasks.
arXiv Detail & Related papers (2020-05-06T01:51:30Z) - Rethinking Generalization of Neural Models: A Named Entity Recognition
Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.