C-SENN: Contrastive Self-Explaining Neural Network
- URL: http://arxiv.org/abs/2206.09575v1
- Date: Mon, 20 Jun 2022 05:23:02 GMT
- Title: C-SENN: Contrastive Self-Explaining Neural Network
- Authors: Yoshihide Sawada, Keigo Nakamura
- Abstract summary: This study combines contrastive learning with concept learning to improve the readability of concepts and the accuracy of tasks.
We call this model the Contrastive Self-Explaining Neural Network (C-SENN).
- Score: 0.5939410304994348
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we use a self-explaining neural network (SENN), which learns unsupervised concepts, to automatically acquire concepts that are easy for people to understand. In concept learning, the hidden layer retains verbalizable features relevant to the output, which is crucial when adapting to real-world environments where explanations are required. However, the interpretability of the concepts output by SENN is known to degrade in general settings, such as autonomous driving scenarios. This study therefore combines contrastive learning with concept learning to improve both the readability of the concepts and the accuracy of the task. We call this model the Contrastive Self-Explaining Neural Network (C-SENN).
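The abstract describes the method only at a high level. As a rough illustration (not the authors' code), the sketch below pairs the standard SENN aggregation f(x) = theta(x)^T h(x) from Alvarez-Melis & Jaakkola (2018) with a SimCLR-style NT-Xent contrastive loss applied to the concept vectors of two augmented views; SENN's stability regularizer is omitted, and all layer sizes and loss weights are illustrative assumptions.

```python
# Minimal PyTorch sketch: SENN-style model plus a contrastive term on concepts.
# Architecture details and the 0.1 loss weight are assumptions, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SENN(nn.Module):
    def __init__(self, backbone_dim=512, n_concepts=10, n_classes=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.LazyLinear(backbone_dim), nn.ReLU())
        self.concepts = nn.Linear(backbone_dim, n_concepts)                 # h(x)
        self.relevances = nn.Linear(backbone_dim, n_concepts * n_classes)   # theta(x)
        self.n_concepts, self.n_classes = n_concepts, n_classes

    def forward(self, x):
        z = self.encoder(x)
        h = self.concepts(z)                                                # (B, k)
        theta = self.relevances(z).view(-1, self.n_classes, self.n_concepts)
        logits = torch.einsum("bck,bk->bc", theta, h)  # f(x) = theta(x)^T h(x)
        return logits, h

def nt_xent(h1, h2, tau=0.5):
    """NT-Xent loss pulling together the concept vectors of two views."""
    n = h1.size(0)
    z = F.normalize(torch.cat([h1, h2]), dim=1)
    sim = z @ z.t() / tau
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

# Training step: task loss plus the contrastive loss on concept vectors.
model = SENN()
x1, x2 = torch.randn(8, 128), torch.randn(8, 128)  # two augmented views
y = torch.randint(0, 4, (8,))
logits, h1 = model(x1)
_, h2 = model(x2)
loss = F.cross_entropy(logits, y) + 0.1 * nt_xent(h1, h2)
loss.backward()
```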
Related papers
- LLM-assisted Concept Discovery: Automatically Identifying and Explaining Neuron Functions [15.381209058506078]
Prior works have associated concepts with neurons based on examples of concepts or a pre-defined set of concepts.
We propose to leverage multimodal large language models for automatic and open-ended concept discovery.
We validate each concept by generating examples and counterexamples and evaluating the neuron's response on this new set of images.
arXiv Detail & Related papers (2024-06-12T18:19:37Z)
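The validation step described above lends itself to a simple loop. This is a hypothetical sketch of that idea, not the paper's API: score a candidate concept for a neuron by the gap between its mean activation on generated examples and on counterexamples (the image-generation step itself is stubbed out).

```python
# Hedged sketch of concept validation: all names here are hypothetical.
from typing import Callable, List

def validate_concept(neuron_response: Callable[[object], float],
                     examples: List[object],
                     counterexamples: List[object]) -> float:
    """Return the mean activation gap between concept examples and counterexamples."""
    pos = sum(neuron_response(img) for img in examples) / len(examples)
    neg = sum(neuron_response(img) for img in counterexamples) / len(counterexamples)
    return pos - neg  # a large positive gap supports the candidate concept
```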
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- Q-SENN: Quantized Self-Explaining Neural Networks [24.305850756291246]
Self-Explaining Neural Networks (SENNs) extract interpretable concepts with fidelity, diversity, and grounding, and combine them linearly for decision-making.
We propose the Quantized Self-Explaining Neural Network (Q-SENN).
Q-SENN satisfies or exceeds the desiderata of SENN while being applicable to more complex datasets.
arXiv Detail & Related papers (2023-12-21T13:39:18Z)
- OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning [49.12350554270196]
We show how modularity can be leveraged to derive a compositional data augmentation framework inspired by imagination.
Our method, denoted Object-centric Compositional Neural Module Network (OC-NMN), decomposes visual generative reasoning tasks into a series of primitives applied to objects without using a domain-specific language.
arXiv Detail & Related papers (2023-10-28T20:12:58Z)
- From Neural Activations to Concepts: A Survey on Explaining Concepts in Neural Networks [15.837316393474403]
Concepts can act as a natural link between learning and reasoning.
Not only can knowledge be extracted from neural networks, but concept knowledge can also be inserted into neural network architectures.
arXiv Detail & Related papers (2023-10-18T11:08:02Z)
- Seeing in Words: Learning to Classify through Language Bottlenecks [59.97827889540685]
Humans can explain their predictions using succinct and intuitive descriptions.
We show that a vision model whose feature representations are text can effectively classify ImageNet images.
arXiv Detail & Related papers (2023-06-29T00:24:42Z)
- Interpreting Neural Policies with Disentangled Tree Representations [58.769048492254555]
We study interpretability of compact neural policies through the lens of disentangled representation.
We leverage decision trees to obtain factors of variation for disentanglement in robot learning.
We introduce interpretability metrics that measure disentanglement of learned neural dynamics.
arXiv Detail & Related papers (2022-10-13T01:10:41Z)
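As a generic illustration of the decision-tree idea in the entry above (not the authors' pipeline), one can fit a shallow tree from a policy's hidden activations to its actions and read off the neurons the tree splits on as candidate factors of variation; the data here is a stand-in.

```python
# Generic sketch: explain a policy's actions via a shallow decision tree over
# its hidden activations. Stand-in random data, not the paper's robot setup.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
activations = rng.normal(size=(1000, 16))        # hidden units of a policy
actions = (activations[:, 3] > 0).astype(int)    # pretend actions hinge on neuron 3

tree = DecisionTreeClassifier(max_depth=3).fit(activations, actions)
print(export_text(tree, feature_names=[f"neuron_{i}" for i in range(16)]))
print("top factors:", np.argsort(tree.feature_importances_)[::-1][:3])
```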
- Searching for the Essence of Adversarial Perturbations [73.96215665913797]
We show that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's erroneous prediction.
This concept of human-recognizable information allows us to explain key features related to adversarial perturbations.
arXiv Detail & Related papers (2022-05-30T18:04:57Z)
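One concrete way to inspect what a perturbation "contains", in the spirit of the entry above, is simply to generate and visualize it. The sketch below uses the standard FGSM attack (Goodfellow et al., 2015) as a stand-in; the paper's own analysis may differ.

```python
# Extract an FGSM perturbation for visual inspection (stand-in for the paper's method).
import torch
import torch.nn.functional as F

def fgsm_perturbation(model, x, y, eps=0.03):
    """Return the FGSM perturbation eps * sign(grad_x loss)."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return eps * x.grad.sign()

# Usage (model/images/labels assumed):
# delta = fgsm_perturbation(model, images, labels)
# plt.imshow(delta[0].permute(1, 2, 0) * 0.5 + 0.5)  # rescale for display
```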
- Human-Centered Concept Explanations for Neural Networks [47.71169918421306]
We introduce concept explanations, including the class of Concept Activation Vectors (CAVs).
We then discuss approaches to automatically extract concepts, and approaches to address some of their caveats.
Finally, we discuss some case studies that showcase the utility of such concept-based explanations in synthetic settings and real world applications.
arXiv Detail & Related papers (2022-02-25T01:27:31Z)
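For the CAV entry above, the standard recipe of Kim et al. (2018) is short enough to sketch: fit a linear classifier separating a layer's activations on concept images from activations on random images, and take the classifier's unit-normalized normal vector as the CAV. The activations below are stand-in arrays.

```python
# Minimal CAV computation in the style of Kim et al. (2018); stand-in data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
concept_acts = rng.normal(loc=1.0, size=(200, 64))  # layer activations on concept images
random_acts = rng.normal(loc=0.0, size=(200, 64))   # activations on random images

X = np.vstack([concept_acts, random_acts])
y = np.array([1] * 200 + [0] * 200)
clf = LogisticRegression(max_iter=1000).fit(X, y)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])   # unit-norm CAV

# Concept sensitivity: dot the CAV with the gradient of a class logit w.r.t.
# the same layer's activations (TCAV aggregates the sign over many inputs).
```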
- Learning Semantically Meaningful Features for Interpretable Classifications [17.88784870849724]
SemCNN learns associations between visual features and word phrases.
Experiment results on multiple benchmark datasets demonstrate that SemCNN can learn features with clear semantic meaning.
arXiv Detail & Related papers (2021-01-11T14:35:16Z)
- A neural network model of perception and reasoning [0.0]
We show that a simple set of biologically consistent organizing principles confers these capabilities on neuronal networks.
We implement these principles in a novel machine learning algorithm, based on concept construction instead of optimization, to design deep neural networks that reason with explainable neuron activity.
arXiv Detail & Related papers (2020-02-26T06:26:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site makes no guarantee as to the quality of the information presented and accepts no responsibility for any consequences arising from its use.