CoSy: Evaluating Textual Explanations of Neurons
- URL: http://arxiv.org/abs/2405.20331v2
- Date: Thu, 05 Dec 2024 15:48:24 GMT
- Title: CoSy: Evaluating Textual Explanations of Neurons
- Authors: Laura Kopf, Philine Lou Bommer, Anna Hedström, Sebastian Lapuschkin, Marina M.-C. Höhne, Kirill Bykov
- Abstract summary: We introduce CoSy, a framework for evaluating textual explanations of latent neurons.
By comparing the neuron's response to generated data points and control data points, we can estimate the quality of the explanation.
We validate our framework through sanity checks and benchmark various neuron description methods for Computer Vision tasks.
- Score: 5.696573924249008
- Abstract: A crucial aspect of understanding the complex nature of Deep Neural Networks (DNNs) is the ability to explain learned concepts within their latent representations. While methods exist to connect neurons to human-understandable textual descriptions, evaluating the quality of these explanations is challenging due to the lack of a unified quantitative approach. We introduce CoSy (Concept Synthesis), a novel, architecture-agnostic framework for evaluating textual explanations of latent neurons. Given textual explanations, our proposed framework uses a generative model conditioned on textual input to create data points representing the explanations. By comparing the neuron's response to these generated data points and control data points, we can estimate the quality of the explanation. We validate our framework through sanity checks and benchmark various neuron description methods for Computer Vision tasks, revealing significant differences in quality.
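The evaluation loop the abstract describes — generate images from the textual explanation, then compare the neuron's activations on those images against activations on control images — can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the function name `cosy_score` and the two scores (a rank-based AUC and a normalized mean activation difference) are assumptions, and the activation arrays stand in for a real generative model and neuron probe.

```python
import numpy as np

def cosy_score(act_generated, act_control):
    """Score a textual explanation of a neuron by comparing the neuron's
    activations on explanation-conditioned synthetic images (act_generated)
    against activations on control images (act_control).

    Returns (auc, mad):
      auc - probability that a generated image activates the neuron more
            strongly than a control image (pairwise rank-based AUC);
            close to 1.0 means the explanation matches the neuron well,
            close to 0.5 means it is no better than chance.
      mad - difference in mean activation, normalized by the spread of
            the control activations.
    """
    gen = np.asarray(act_generated, dtype=float)
    ctl = np.asarray(act_control, dtype=float)
    # Pairwise comparison: fraction of (generated, control) pairs where
    # the generated image wins.
    auc = (gen[:, None] > ctl[None, :]).mean()
    # Mean activation difference in units of control standard deviation.
    mad = (gen.mean() - ctl.mean()) / (ctl.std() + 1e-9)
    return auc, mad
```

In a full pipeline, `act_generated` would come from running the network on images produced by a text-to-image model conditioned on the explanation, and `act_control` from a generic image set; a faithful explanation should push the AUC well above 0.5.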
Related papers
- Discovering Chunks in Neural Embeddings for Interpretability [53.80157905839065]
We propose leveraging the principle of chunking to interpret artificial neural population activities.
We first demonstrate this concept in recurrent neural networks (RNNs) trained on artificial sequences with imposed regularities.
We identify similar recurring embedding states corresponding to concepts in the input, with perturbations to these states activating or inhibiting the associated concepts.
arXiv Detail & Related papers (2025-02-03T20:30:46Z)
- Less is More: Discovering Concise Network Explanations [26.126343100127936]
We introduce Discovering Conceptual Network Explanations (DCNE), a new approach for generating human-comprehensible visual explanations.
Our method automatically finds visual explanations that are critical for discriminating between classes.
DCNE represents a step forward in making neural network decisions accessible and interpretable to humans.
arXiv Detail & Related papers (2024-05-24T06:10:23Z)
- Towards Generating Informative Textual Description for Neurons in Language Models [6.884227665279812]
We propose a framework that ties textual descriptions to neurons.
In particular, our experiments show that the proposed approach achieves 75% precision@2 and 50% recall@2.
arXiv Detail & Related papers (2024-01-30T04:06:25Z)
- NeuroExplainer: Fine-Grained Attention Decoding to Uncover Cortical Development Patterns of Preterm Infants [73.85768093666582]
We propose an explainable geometric deep network dubbed NeuroExplainer.
NeuroExplainer is used to uncover altered infant cortical development patterns associated with preterm birth.
arXiv Detail & Related papers (2023-01-01T12:48:12Z)
- Synergistic information supports modality integration and flexible learning in neural networks solving multiple tasks [107.8565143456161]
We investigate the information processing strategies adopted by simple artificial neural networks performing a variety of cognitive tasks.
Results show that synergy increases as neural networks learn multiple diverse tasks.
Randomly turning off neurons during training through dropout increases network redundancy, corresponding to an increase in robustness.
arXiv Detail & Related papers (2022-10-06T15:36:27Z)
- Formal Conceptual Views in Neural Networks [0.0]
We introduce two notions for conceptual views of a neural network, specifically a many-valued and a symbolic view.
We test the conceptual expressivity of our novel views through different experiments on the ImageNet and Fruit-360 data sets.
We demonstrate how conceptual views can be applied for abductive learning of human comprehensible rules from neurons.
arXiv Detail & Related papers (2022-09-27T16:38:24Z)
- Global Concept-Based Interpretability for Graph Neural Networks via Neuron Analysis [0.0]
Graph neural networks (GNNs) are highly effective on a variety of graph-related tasks, but they lack interpretability and transparency.
Current explainability approaches are typically local and treat GNNs as black boxes.
We propose a novel approach for producing global explanations for GNNs using neuron-level concepts.
arXiv Detail & Related papers (2022-08-22T21:30:55Z)
- Natural Language Descriptions of Deep Visual Features [50.270035018478666]
We introduce MILAN, a procedure that automatically labels neurons with open-ended, compositional, natural language descriptions.
We use MILAN for analysis, characterizing the distribution and importance of neurons selective for attribute, category, and relational information in vision models.
We also use MILAN for auditing, surfacing neurons sensitive to protected categories like race and gender in models trained on datasets intended to obscure these features.
arXiv Detail & Related papers (2022-01-26T18:48:02Z)
- Generalizable Neuro-symbolic Systems for Commonsense Question Answering [67.72218865519493]
This chapter illustrates how suitable neuro-symbolic models for language understanding can enable domain generalizability and robustness in downstream tasks.
Different methods for integrating neural language models and knowledge graphs are discussed.
arXiv Detail & Related papers (2022-01-17T06:13:37Z)
- Compositional Explanations of Neurons [52.71742655312625]
We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts.
We use this procedure to answer several questions on interpretability in models for vision and natural language processing.
arXiv Detail & Related papers (2020-06-24T20:37:05Z)
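The "Compositional Explanations of Neurons" entry above describes identifying logical concept formulas that explain a neuron's behavior. One plausible realization is a greedy search over AND/OR/AND-NOT compositions of binary concept masks, scored by intersection-over-union (IoU) against the neuron's binarized activation mask. This sketch is illustrative: the greedy strategy and the function names are assumptions, not that paper's exact algorithm.

```python
import numpy as np

def iou(a, b):
    # Intersection-over-union of two boolean masks.
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def compositional_explanation(neuron_mask, concepts, max_len=3):
    """Greedily build a logical formula over named concept masks
    (via AND / OR / AND NOT) that maximizes IoU with the binarized
    neuron activation mask. Returns (formula, mask, iou_score)."""
    # Start from the single best-matching concept.
    name, mask = max(concepts.items(), key=lambda kv: iou(neuron_mask, kv[1]))
    formula, best = name, iou(neuron_mask, mask)
    for _ in range(max_len - 1):
        # Try extending the current formula with every concept and operator.
        candidates = []
        for cname, cmask in concepts.items():
            candidates += [
                (f"({formula} AND {cname})", np.logical_and(mask, cmask)),
                (f"({formula} OR {cname})", np.logical_or(mask, cmask)),
                (f"({formula} AND NOT {cname})", np.logical_and(mask, ~cmask)),
            ]
        cformula, cmask = max(candidates, key=lambda c: iou(neuron_mask, c[1]))
        if iou(neuron_mask, cmask) <= best:
            break  # no extension improves the fit
        formula, mask, best = cformula, cmask, iou(neuron_mask, cmask)
    return formula, mask, best
```

With concepts such as "water", "blue", and "river" as spatial masks over a probing dataset, a neuron might be explained by a formula like `(water OR river) AND NOT blue`, with the IoU score quantifying how faithful that formula is.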
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.