Discovering and Explaining the Representation Bottleneck of DNNs
- URL: http://arxiv.org/abs/2111.06236v1
- Date: Thu, 11 Nov 2021 14:35:20 GMT
- Title: Discovering and Explaining the Representation Bottleneck of DNNs
- Authors: Huiqi Deng, Qihan Ren, Xu Chen, Hao Zhang, Jie Ren, Quanshi Zhang
- Abstract summary: This paper explores the bottleneck of feature representations of deep neural networks (DNNs).
We focus on the multi-order interaction between input variables, where the order represents the complexity of interactions.
We discover that a DNN is more likely to encode both too simple interactions and too complex interactions, but usually fails to learn interactions of intermediate complexity.
- Score: 21.121270460158712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores the bottleneck of feature representations of deep neural
networks (DNNs), from the perspective of the complexity of interactions between
input variables encoded in DNNs. To this end, we focus on the multi-order
interaction between input variables, where the order represents the complexity
of interactions. We discover that a DNN is more likely to encode both too
simple interactions and too complex interactions, but usually fails to learn
interactions of intermediate complexity. Such a phenomenon is widely shared by
different DNNs for different tasks. This phenomenon indicates a cognition gap
between DNNs and human beings, and we call it a representation bottleneck. We
theoretically prove the underlying reason for the representation bottleneck.
Furthermore, we propose a loss to encourage/penalize the learning of
interactions of specific complexities, and analyze the representation
capacities of interactions of different complexities.
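The abstract does not spell out how a multi-order interaction is computed. As a rough illustration only, the sketch below assumes the commonly used definition: the order-m interaction between variables i and j is the average, over all contexts S of size m drawn from the remaining variables, of f(S ∪ {i, j}) - f(S ∪ {i}) - f(S ∪ {j}) + f(S). The names `multi_order_interaction` and `toy_model` are hypothetical, and the toy model is a stand-in for a DNN evaluated on a masked input, not the authors' code.

```python
from itertools import combinations
from statistics import mean


def multi_order_interaction(f, n, i, j, m):
    """Average interaction between variables i and j over all contexts S
    of size m drawn from the other n - 2 variables (assumed definition)."""
    others = [k for k in range(n) if k not in (i, j)]
    deltas = []
    for context in combinations(others, m):
        S = frozenset(context)
        # Marginal effect of having i and j present together, given context S.
        delta = f(S | {i, j}) - f(S | {i}) - f(S | {j}) + f(S)
        deltas.append(delta)
    return mean(deltas)


def toy_model(present):
    """Hypothetical stand-in for a model output on a masked input:
    fires only when variables 0, 1 and 2 are all present."""
    return 1.0 if {0, 1, 2} <= present else 0.0


# Interaction between variables 0 and 1 at each possible order (n = 4).
for order in range(3):
    print(order, multi_order_interaction(toy_model, 4, 0, 1, order))
```

Exact enumeration like this is only feasible for a handful of variables; with many input variables the contexts would have to be sampled instead.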
Related papers
- Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features [68.3512123520931]
We investigate the dynamics of a deep neural network (DNN) learning interactions.
In this paper, we discover the DNN learns interactions in two phases.
The first phase mainly penalizes interactions of medium and high orders, and the second phase mainly learns interactions of gradually increasing orders.
arXiv Detail & Related papers (2024-05-16T17:13:25Z)
- Defining and Extracting generalizable interaction primitives from DNNs [22.79131582164054]
We develop a new method to extract interactions that are shared by different deep neural networks (DNNs).
Experiments show that the extracted interactions can better reflect common knowledge shared by different DNNs.
arXiv Detail & Related papers (2024-01-29T17:21:41Z)
- Neural Amortized Inference for Nested Multi-agent Reasoning [54.39127942041582]
We propose a novel approach to bridge the gap between human-like inference capabilities and computational limitations.
We evaluate our method in two challenging multi-agent interaction domains.
arXiv Detail & Related papers (2023-08-21T22:40:36Z)
- Technical Note: Defining and Quantifying AND-OR Interactions for Faithful and Concise Explanation of DNNs [24.099892982101398]
We aim to explain a deep neural network (DNN) by quantifying the encoded interactions between input variables.
Specifically, we first rethink the definition of interactions, and then formally define faithfulness and conciseness for interaction-based explanation.
arXiv Detail & Related papers (2023-04-26T06:33:31Z)
- Explaining Generalization Power of a DNN Using Interactive Concepts [24.712192363947096]
This paper explains the generalization power of a deep neural network (DNN) from the perspective of interactions.
We also discover the detouring dynamics of learning complex concepts, which explains both the high learning difficulty and the low generalization power of complex concepts.
arXiv Detail & Related papers (2023-02-25T14:44:40Z)
- Discovering the Representation Bottleneck of Graph Neural Networks from Multi-order Interactions [51.597480162777074]
Graph neural networks (GNNs) rely on the message passing paradigm to propagate node features and build interactions.
Recent works point out that different graph learning tasks require different ranges of interactions between nodes.
We study two common graph construction methods in scientific domains, i.e., K-nearest neighbor (KNN) graphs and fully-connected (FC) graphs.
arXiv Detail & Related papers (2022-05-15T11:38:14Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Interpreting Multivariate Shapley Interactions in DNNs [33.67263820904767]
This paper aims to explain deep neural networks (DNNs) from the perspective of multivariate interactions.
In this paper, we define and quantify the significance of interactions among multiple input variables of the DNN.
arXiv Detail & Related papers (2020-10-10T17:02:51Z)
- Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
- Architecture Disentanglement for Deep Neural Networks [174.16176919145377]
We introduce neural architecture disentanglement (NAD) to explain the inner workings of deep neural networks (DNNs).
NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes.
Results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones.
arXiv Detail & Related papers (2020-03-30T08:34:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.