Emergence of Concepts in DNNs?
- URL: http://arxiv.org/abs/2211.06137v1
- Date: Fri, 11 Nov 2022 11:25:39 GMT
- Title: Emergence of Concepts in DNNs?
- Authors: Tim Räz
- Abstract summary: It is examined, first, how existing methods actually identify concepts that are supposedly represented in DNNs.
Second, it is discussed how conceptual spaces are shaped by a tradeoff between predictive accuracy and compression.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The present paper reviews and discusses work from computer science that
proposes to identify concepts in internal representations (hidden layers) of
DNNs. It is examined, first, how existing methods actually identify concepts
that are supposedly represented in DNNs. Second, it is discussed how conceptual
spaces -- sets of concepts in internal representations -- are shaped by a
tradeoff between predictive accuracy and compression. These issues are
critically examined by drawing on philosophy. While there is evidence that DNNs
are able to represent non-trivial inferential relations between concepts, our
ability to identify concepts is severely limited.
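One way to make the accuracy-compression tradeoff in the abstract concrete is an information-bottleneck-style objective; this is an illustrative formalization under that assumption, not necessarily the exact one the paper adopts:

```latex
% Information-bottleneck-style objective: the internal representation Z of
% input X should stay predictive of the label Y (high I(Z;Y)) while
% compressing away the rest of X (low I(X;Z)); \beta sets the tradeoff.
\max_{p(z \mid x)} \; I(Z;Y) \;-\; \beta \, I(X;Z)
```

A larger \beta forces more compression and a sparser conceptual space; a smaller \beta preserves more input detail at the cost of less abstraction.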
Related papers
- Implicit Concept Removal of Diffusion Models [92.55152501707995]
Text-to-image (T2I) diffusion models often inadvertently generate unwanted concepts such as watermarks and unsafe images.
We present Geom-Erasing, a novel concept removal method based on geometric-driven control.
arXiv Detail & Related papers (2023-10-09T17:13:10Z)
- Interpretable Neural-Symbolic Concept Reasoning [7.1904050674791185]
Concept-based models aim to address the opacity of deep learning by learning tasks based on a set of human-understandable concepts.
We propose the Deep Concept Reasoner (DCR), the first interpretable concept-based model that builds upon concept embeddings.
arXiv Detail & Related papers (2023-04-27T09:58:15Z)
- Bayesian Neural Networks Avoid Encoding Complex and Perturbation-Sensitive Concepts [22.873523599349326]
In this paper, we focus on mean-field variational Bayesian Neural Networks (BNNs) and explore the representation capacity of such BNNs.
It has been observed that a relatively small set of interactive concepts usually emerges in the knowledge representation of a sufficiently trained neural network.
Our study proves that, compared to standard deep neural networks (DNNs), BNNs are less likely to encode complex concepts.
arXiv Detail & Related papers (2023-02-25T14:56:35Z)
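For readers unfamiliar with the setting, a minimal sketch of a mean-field variational BNN layer (illustrative only; the class, data, and hyperparameters are invented, not the paper's model):

```python
# Mean-field variational BNN layer (toy sketch, not the paper's model):
# each weight has an independent Gaussian posterior, and a forward pass
# samples weights via the reparameterization trick.
import numpy as np

class MeanFieldLinear:
    def __init__(self, n_in, n_out, rng):
        self.mu = rng.normal(0.0, 0.1, (n_in, n_out))   # posterior means
        self.log_sigma = np.full((n_in, n_out), -3.0)   # posterior log-stddevs
        self.rng = rng

    def forward(self, x):
        eps = self.rng.normal(size=self.mu.shape)
        w = self.mu + np.exp(self.log_sigma) * eps      # sampled weights
        return x @ w

layer = MeanFieldLinear(16, 4, np.random.default_rng(0))
out = layer.forward(np.ones((2, 16)))                   # stochastic output
```

The sampled-weight noise is the feature the paper connects to concept encoding: concepts that are sensitive to perturbations are plausibly harder to maintain under such stochastic weights.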
- Does a Neural Network Really Encode Symbolic Concepts? [24.099892982101398]
In this paper, we examine the trustworthiness of interaction concepts from four perspectives.
Extensive empirical studies have verified that a well-trained DNN usually encodes sparse, transferable, and discriminative concepts.
arXiv Detail & Related papers (2023-02-25T13:58:37Z)
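The interaction concepts in the two entries above are commonly formalized as Harsanyi interactions between input variables; a toy sketch under that assumption (the value function v below is invented for illustration):

```python
# Toy Harsanyi-interaction sketch: I(S) measures the effect attributable
# exactly to the coalition S of input variables acting together.
# v(T) stands for the model output with only the variables in T present.
from itertools import chain, combinations

def subsets(S):
    return chain.from_iterable(combinations(S, r) for r in range(len(S) + 1))

def harsanyi(v, S):
    """I(S) = sum over T subset of S of (-1)^(|S| - |T|) * v(T)."""
    return sum((-1) ** (len(S) - len(T)) * v(T) for T in subsets(S))

v = lambda T: len(T) ** 2          # toy value function
print(harsanyi(v, (0, 1)))         # 4 - 1 - 1 + 0 = 2
```

A concept then corresponds to a coalition S with a salient interaction I(S); sparsity means that only a few coalitions have non-negligible values.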
- Concept Activation Regions: A Generalized Framework For Concept-Based Explanations [95.94432031144716]
Existing methods assume that the examples illustrating a concept are mapped in a fixed direction of the deep neural network's latent space.
In this work, we propose allowing concept examples to be scattered across different clusters in the DNN's latent space.
This concept activation region (CAR) formalism yields global concept-based explanations and local concept-based feature importance.
arXiv Detail & Related papers (2022-09-22T17:59:03Z)
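A minimal sketch of the CAR idea under stated assumptions (toy data, and scikit-learn's SVC as one possible region learner; the authors' actual pipeline may differ):

```python
# CAR-style sketch (toy): a kernel SVM on hidden activations lets concept
# examples occupy several disjoint regions of latent space, unlike a
# single linear CAV direction.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
acts_concept = rng.normal(0.0, 1.0, (100, 64))  # activations of concept examples (toy)
acts_random = rng.normal(2.0, 1.0, (100, 64))   # activations of random negatives (toy)

X = np.vstack([acts_concept, acts_random])
y = np.array([1] * 100 + [0] * 100)

car = SVC(kernel="rbf", probability=True).fit(X, y)  # nonlinear concept region

def concept_score(activation):
    """Soft membership of one activation vector in the concept region."""
    return car.predict_proba(activation.reshape(1, -1))[0, 1]

print(concept_score(acts_concept[0]))
```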
- Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vector (CAV) relies on learning a linear relation between some latent representation of a given model and concepts.
We propose Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions.
We demonstrate that CG outperforms CAV on both toy examples and real-world datasets (a toy sketch of the CAV baseline follows this entry).
arXiv Detail & Related papers (2022-08-31T17:06:46Z)
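To make the CAV baseline concrete, a toy sketch (invented data; not the authors' implementation): a linear probe on activations yields the concept direction, and sensitivity is the directional derivative of the model output along it.

```python
# Toy CAV sketch: learn a linear concept direction in activation space,
# then score concept sensitivity as the alignment of the output gradient
# with that direction. Data and probe are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
acts_pos = rng.normal(1.0, 1.0, (100, 32))   # activations of concept examples
acts_neg = rng.normal(-1.0, 1.0, (100, 32))  # activations of non-concept examples

X = np.vstack([acts_pos, acts_neg])
y = np.array([1] * 100 + [0] * 100)

probe = LogisticRegression().fit(X, y)
cav = probe.coef_[0] / np.linalg.norm(probe.coef_[0])  # concept activation vector

def conceptual_sensitivity(grad_output_wrt_activation):
    """Directional derivative of the model output along the CAV."""
    return float(grad_output_wrt_activation @ cav)
```

CG replaces the linear probe with a general differentiable concept function, removing the linearity assumption this sketch bakes in.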
- Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries [45.88354622464973]
ConceptEvo is a unified interpretation framework for deep neural networks (DNNs).
It reveals the inception and evolution of learned concepts during training.
It is applicable to both modern DNN architectures, such as ConvNeXt, and classic DNNs, such as VGGs and InceptionV3.
arXiv Detail & Related papers (2022-03-30T17:12:18Z)
- Kernelized Concept Erasure [108.65038124096907]
We propose a kernelization of a linear minimax game for concept erasure.
It is possible to prevent specific nonlinear adversaries from predicting the concept.
However, the protection does not transfer to different nonlinear adversaries.
arXiv Detail & Related papers (2022-01-28T15:45:13Z)
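For intuition, a sketch of the linear special case that the paper generalizes (the paper's contribution is a kernelized minimax game, which this simple orthogonal projection does not capture):

```python
# Toy linear concept erasure: remove a concept direction w from the
# representations by projecting onto the orthogonal complement of w.
# (The paper kernelizes a minimax game; this is only the linear intuition.)
import numpy as np

def erase_linear_concept(X, w):
    """Project rows of X onto the subspace orthogonal to direction w."""
    w = w / np.linalg.norm(w)
    return X - np.outer(X @ w, w)  # subtract each row's component along w

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
w = rng.normal(size=8)
X_clean = erase_linear_concept(X, w)
# No linear predictor can now recover the concept along w:
assert np.allclose(X_clean @ (w / np.linalg.norm(w)), 0.0)
```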
- Towards Fully Interpretable Deep Neural Networks: Are We There Yet? [17.88784870849724]
Deep Neural Networks (DNNs) behave as black boxes, hindering user trust in Artificial Intelligence (AI) systems.
This paper provides a review of existing methods to develop DNNs with intrinsic interpretability.
arXiv Detail & Related papers (2021-06-24T16:37:34Z)
- Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
Its main technical goal is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.