Discovering and Explaining the Representation Bottleneck of DNNs
- URL: http://arxiv.org/abs/2111.06236v1
- Date: Thu, 11 Nov 2021 14:35:20 GMT
- Title: Discovering and Explaining the Representation Bottleneck of DNNs
- Authors: Huiqi Deng, Qihan Ren, Xu Chen, Hao Zhang, Jie Ren, Quanshi Zhang
- Abstract summary: This paper explores the bottleneck of feature representations of deep neural networks (DNNs).
We focus on the multi-order interaction between input variables, where the order represents the complexity of interactions.
We discover that a DNN is more likely to encode both overly simple and overly complex interactions, but usually fails to learn interactions of intermediate complexity.
- Score: 21.121270460158712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores the bottleneck of feature representations of deep neural
networks (DNNs), from the perspective of the complexity of interactions between
input variables encoded in DNNs. To this end, we focus on the multi-order
interaction between input variables, where the order represents the complexity
of interactions. We discover that a DNN is more likely to encode both overly
simple and overly complex interactions, but usually fails to learn
interactions of intermediate complexity. Such a phenomenon is widely shared by
different DNNs for different tasks. This phenomenon indicates a cognition gap
between DNNs and human beings, and we call it a representation bottleneck. We
theoretically prove the underlying reason for the representation bottleneck.
Furthermore, we propose a loss to encourage/penalize the learning of
interactions of specific complexities, and analyze the representation
capacities of interactions of different complexities.
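For concreteness, the order-m interaction between two variables i and j is the interaction effect f(S∪{i,j}) − f(S∪{i}) − f(S∪{j}) + f(S), averaged over contexts S of m variables drawn from the remaining input variables. Below is a minimal Monte Carlo sketch of that definition; the scoring function f (e.g., the log-odds of the true class), the masking-by-baseline scheme, and all names are our assumptions, not the authors' released code.

```python
import random

def multi_order_interaction(f, x, baseline, i, j, m, n_samples=100):
    """Monte Carlo estimate of the order-m interaction
    I^(m)(i, j) = E_{S ⊆ N\\{i,j}, |S|=m}
                  [ f(S∪{i,j}) - f(S∪{i}) - f(S∪{j}) + f(S) ].

    f        : scalar scoring function on a masked input, e.g. the
               log-odds of the true class (an assumption).
    x        : list of input variables.
    baseline : per-variable values that stand in for absent variables.
    """
    n = len(x)
    rest = [k for k in range(n) if k not in (i, j)]

    def masked(keep):
        # Variables outside `keep` are replaced by their baseline value.
        return [x[k] if k in keep else baseline[k] for k in range(n)]

    total = 0.0
    for _ in range(n_samples):
        S = set(random.sample(rest, m))  # random context of order m
        total += (f(masked(S | {i, j})) - f(masked(S | {i}))
                  - f(masked(S | {j})) + f(masked(S)))
    return total / n_samples
```

Averaging the strength |I^(m)(i, j)| over many pairs at each order m yields the per-order distribution the paper studies; per the abstract, the proposed loss then penalizes or rewards that strength at selected orders, though its exact form is given in the paper.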
Related papers
- Towards the Dynamics of a DNN Learning Symbolic Interactions [20.493304123269446]
In recent years, a series of theorems has been proven showing that, given an input sample, a small number of interactions between input variables can be regarded as primitive inference patterns.
This study proves the two-phase dynamics of a deep neural network (DNN) learning interactions.
arXiv Detail & Related papers (2024-07-27T07:34:49Z)
- Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features [68.3512123520931]
We investigate the dynamics of a deep neural network (DNN) learning interactions.
In this paper, we discover that a DNN learns interactions in two phases.
The first phase mainly penalizes interactions of medium and high orders, and the second phase mainly learns interactions of gradually increasing orders.
arXiv Detail & Related papers (2024-05-16T17:13:25Z)
- Defining and Extracting generalizable interaction primitives from DNNs [22.79131582164054]
We develop a new method to extract interactions that are shared by different deep neural networks (DNNs).
Experiments show that the extracted interactions can better reflect common knowledge shared by different DNNs.
arXiv Detail & Related papers (2024-01-29T17:21:41Z)
- Neural Amortized Inference for Nested Multi-agent Reasoning [54.39127942041582]
We propose a novel approach to bridge the gap between human-like inference capabilities and computational limitations.
We evaluate our method in two challenging multi-agent interaction domains.
arXiv Detail & Related papers (2023-08-21T22:40:36Z)
- Technical Note: Defining and Quantifying AND-OR Interactions for Faithful and Concise Explanation of DNNs [24.099892982101398]
We aim to explain a deep neural network (DNN) by quantifying the encoded interactions between input variables.
Specifically, we first rethink the definition of interactions, and then formally define faithfulness and conciseness for interaction-based explanation.
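In this line of work, the AND interaction of a set S of variables is commonly formalized as the Harsanyi dividend; a small sketch under that assumption, reusing the masking-by-baseline scheme from the sketch above (names are ours):

```python
from itertools import combinations

def and_interaction(f, x, baseline, S):
    """Harsanyi dividend I(S) = sum over T ⊆ S of (-1)^(|S|-|T|) f(T),
    a common formalization of the AND interaction (our assumption)."""
    n = len(x)

    def masked(keep):
        # Keep variables in `keep`; replace the rest by baseline values.
        return [x[k] if k in keep else baseline[k] for k in range(n)]

    S = list(S)
    total = 0.0
    for r in range(len(S) + 1):
        for T in combinations(S, r):
            total += (-1) ** (len(S) - r) * f(masked(set(T)))
    return total
```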
arXiv Detail & Related papers (2023-04-26T06:33:31Z)
- Discovering the Representation Bottleneck of Graph Neural Networks from Multi-order Interactions [51.597480162777074]
Graph neural networks (GNNs) rely on the message passing paradigm to propagate node features and build interactions.
Recent works point out that different graph learning tasks require different ranges of interactions between nodes.
We study two common graph construction methods in scientific domains, i.e., K-nearest neighbor (KNN) graphs and fully-connected (FC) graphs.
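For context, the two constructions differ only in which node pairs are wired together, which bounds the orders of interactions that message passing can express. A minimal NumPy sketch; function names are ours, and real pipelines typically use library helpers such as torch_geometric's knn_graph:

```python
import numpy as np

def knn_adjacency(points, k):
    """Directed KNN graph: node i links to its k nearest neighbors."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)                  # exclude self-loops
    adj = np.zeros(dist.shape, dtype=bool)
    rows = np.arange(len(points))[:, None]
    adj[rows, np.argsort(dist, axis=1)[:, :k]] = True
    return adj

def fc_adjacency(n):
    """Fully-connected graph: every pair of distinct nodes is linked."""
    return ~np.eye(n, dtype=bool)
```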
arXiv Detail & Related papers (2022-05-15T11:38:14Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Interpreting Multivariate Shapley Interactions in DNNs [33.67263820904767]
This paper aims to explain deep neural networks (DNNs) from the perspective of multivariate interactions.
In this paper, we define and quantify the significance of interactions among multiple input variables of the DNN.
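The pairwise special case of such measures is the classical Shapley interaction index of Grabisch and Roubens, which the multi-order interaction above decomposes by order: drawing the order m uniformly from {0, ..., n-2} reproduces the Shapley weights exactly. A sketch reusing multi_order_interaction and random from the first code block:

```python
def shapley_interaction(f, x, baseline, i, j, n_samples=200):
    """Unbiased Monte Carlo estimate of the pairwise Shapley interaction
    index: I(i, j) = E_m[ I^(m)(i, j) ] with m drawn uniformly."""
    n = len(x)
    total = 0.0
    for _ in range(n_samples):
        m = random.randrange(n - 1)   # uniform order in {0, ..., n-2}
        total += multi_order_interaction(f, x, baseline, i, j, m,
                                         n_samples=1)
    return total / n_samples
```

The paper itself goes beyond pairs and quantifies the significance of interactions among arbitrary sets of input variables.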
arXiv Detail & Related papers (2020-10-10T17:02:51Z)
- Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising direction, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
- Architecture Disentanglement for Deep Neural Networks [174.16176919145377]
We introduce neural architecture disentanglement (NAD) to explain the inner workings of deep neural networks (DNNs).
NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes.
Results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones.
arXiv Detail & Related papers (2020-03-30T08:34:33Z)