Explaining Generalization Power of a DNN Using Interactive Concepts
- URL: http://arxiv.org/abs/2302.13091v2
- Date: Fri, 13 Sep 2024 09:19:14 GMT
- Title: Explaining Generalization Power of a DNN Using Interactive Concepts
- Authors: Huilin Zhou, Hao Zhang, Huiqi Deng, Dongrui Liu, Wen Shen, Shih-Han Chan, Quanshi Zhang
- Abstract summary: This paper explains the generalization power of a deep neural network (DNN) from the perspective of interactions.
We also discover the detouring dynamics of learning complex concepts, which explains both the high learning difficulty and the low generalization power of complex concepts.
- Score: 24.712192363947096
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explains the generalization power of a deep neural network (DNN) from the perspective of interactions. Although there is no universally accepted definition of the concepts encoded by a DNN, the sparsity of interactions in a DNN has been proved, i.e., the output score of a DNN can be well explained by a small number of interactions between input variables. In this way, to some extent, we can consider such interactions as interactive concepts encoded by the DNN. Therefore, in this paper, we derive an analytic explanation of the inconsistency of concepts of different complexities. This may shed new light on using the generalization power of concepts to explain the generalization power of the entire DNN. Moreover, we discover that a DNN with stronger generalization power usually learns simple concepts more quickly and encodes fewer complex concepts. We also discover the detouring dynamics of learning complex concepts, which explains both the high learning difficulty and the low generalization power of complex concepts. The code will be released when the paper is accepted.
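For readers outside this research line, the following is a minimal sketch of the interaction formalism the abstract relies on, assuming the Harsanyi-dividend definition of interactions commonly used in these papers; the symbols N, v, and Omega are illustrative notation rather than quotations from the paper. Let N = {1, ..., n} be the input variables and let v(x_S) denote the DNN output when only the variables in S ⊆ N are kept (the rest masked):

```latex
\[
  I(S) \;=\; \sum_{T \subseteq S} (-1)^{|S|-|T|}\, v(x_T),
  \qquad
  v(x_N) \;=\; \sum_{S \subseteq N} I(S) \;\approx\; \sum_{S \in \Omega} I(S),
  \quad |\Omega| \ll 2^{n}.
\]
```

Sparsity means the salient set Omega is tiny relative to all 2^n subsets, and the order |S| serves as a concept's complexity: low-order interactions are the "simple concepts" and high-order interactions the "complex concepts" discussed above.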
Related papers
- Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features [68.3512123520931]
We investigate the dynamics of a deep neural network (DNN) learning interactions.
In this paper, we discover that the DNN learns interactions in two phases.
The first phase mainly penalizes interactions of medium and high orders, and the second phase mainly learns interactions of gradually increasing orders (a toy illustration of this order-wise view follows this entry).
arXiv Detail & Related papers (2024-05-16T17:13:25Z)
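As a toy illustration of the order-wise view (a hedged sketch in Python; the variable count, interaction values, and the two checkpoints are invented for demonstration, not data from the paper):

```python
def mean_strength_by_order(interactions):
    """Mean |I(S)| per order m = |S|, given a dict frozenset(S) -> I(S)."""
    totals, counts = {}, {}
    for S, value in interactions.items():
        m = len(S)
        totals[m] = totals.get(m, 0.0) + abs(value)
        counts[m] = counts.get(m, 0) + 1
    return {m: totals[m] / counts[m] for m in sorted(totals)}

# Hypothetical interaction values at two training checkpoints.
early = {frozenset({0}): 0.9, frozenset({1}): 0.7,
         frozenset({0, 1}): 0.2, frozenset({0, 1, 2}): 0.05}
late = {frozenset({0}): 1.0, frozenset({1}): 0.8,
        frozenset({0, 1}): 0.5, frozenset({0, 1, 2}): 0.4}

print(mean_strength_by_order(early))  # low orders dominate early on
print(mean_strength_by_order(late))   # higher orders strengthen later
```

Plotting such per-order strengths across checkpoints is how the two phases become visible.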
- Defining and Extracting generalizable interaction primitives from DNNs [22.79131582164054]
We develop a new method to extract interactions that are shared by different deep neural networks (DNNs).
Experiments show that the extracted interactions can better reflect the common knowledge shared by different DNNs (a naive sketch of this goal follows this entry).
arXiv Detail & Related papers (2024-01-29T17:21:41Z)
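The paper formulates shared-interaction extraction as its own method; the snippet below is only a naive illustration of the goal (the threshold tau and the sign-agreement rule are assumptions for demonstration, not the paper's algorithm):

```python
def shared_interactions(model_a, model_b, tau=0.1):
    """Keep interactions that are salient in both models and agree in sign.
    Inputs are dicts mapping frozenset(S) -> I(S), one per DNN."""
    shared = {}
    for S in model_a.keys() & model_b.keys():
        ia, ib = model_a[S], model_b[S]
        if min(abs(ia), abs(ib)) > tau and ia * ib > 0:
            shared[S] = 0.5 * (ia + ib)  # average the agreeing effects
    return shared

# Hypothetical interactions extracted from two DNNs on the same input.
dnn_1 = {frozenset({0}): 0.8, frozenset({1, 2}): 0.4, frozenset({3}): -0.3}
dnn_2 = {frozenset({0}): 0.7, frozenset({1, 2}): -0.2, frozenset({3}): -0.4}
print(shared_interactions(dnn_1, dnn_2))
# {frozenset({0}): 0.75, frozenset({3}): -0.35} up to float rounding;
# the {1, 2} interaction is dropped because the two DNNs disagree in sign.
```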
- Technical Note: Defining and Quantifying AND-OR Interactions for Faithful and Concise Explanation of DNNs [24.099892982101398]
We aim to explain a deep neural network (DNN) by quantifying the encoded interactions between input variables.
Specifically, we first rethink the definition of interactions, and then formally define faithfulness and conciseness for interaction-based explanations (a hedged reconstruction of the two interaction forms follows this entry).
arXiv Detail & Related papers (2023-04-26T06:33:31Z)
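As a hedged reconstruction of the two interaction forms (the AND interaction matches the Harsanyi definition sketched earlier; the OR form below is the usual De-Morgan-style dual used in this research line, and the linked note should be consulted for the exact definitions):

```latex
\[
  I_{\mathrm{and}}(S) \;=\; \sum_{T \subseteq S} (-1)^{|S|-|T|}\, v(x_T),
  \qquad
  I_{\mathrm{or}}(S) \;=\; -\sum_{T \subseteq S} (-1)^{|S|-|T|}\, v(x_{N \setminus T}).
\]
```

Roughly, faithfulness asks that these effects jointly reproduce the network output, and conciseness that only a few of them are salient.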
- Bayesian Neural Networks Avoid Encoding Complex and Perturbation-Sensitive Concepts [22.873523599349326]
In this paper, we focus on mean-field variational Bayesian Neural Networks (BNNs) and explore the representation capacity of such BNNs.
It has been observed and studied that a relatively small set of interactive concepts usually emerges in the knowledge representation of a sufficiently trained neural network.
Our study proves that compared to standard deep neural networks (DNNs), it is less likely for BNNs to encode complex concepts.
arXiv Detail & Related papers (2023-02-25T14:56:35Z)
- Does a Neural Network Really Encode Symbolic Concepts? [24.099892982101398]
In this paper, we examine the trustworthiness of interaction concepts from four perspectives.
Extensive empirical studies have verified that a well-trained DNN usually encodes sparse, transferable, and discriminative concepts.
arXiv Detail & Related papers (2023-02-25T13:58:37Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of their Jacobians, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning (a coupling-layer sketch of invertibility by design follows this entry).
arXiv Detail & Related papers (2022-04-15T10:45:26Z)
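To make "invertible by design" concrete, here is a minimal NumPy sketch of an affine coupling layer in the RealNVP style, one standard INN building block (a generic textbook construction, not an architecture taken from the paper):

```python
import numpy as np

def coupling_forward(x, scale_net, shift_net):
    """Affine coupling: transform the second half of x conditioned on the
    first half; invertible for any scale_net/shift_net by construction.
    log|det J| = scale_net(x1).sum(), so the Jacobian stays tractable."""
    x1, x2 = np.split(x, 2)
    y2 = x2 * np.exp(scale_net(x1)) + shift_net(x1)
    return np.concatenate([x1, y2])

def coupling_inverse(y, scale_net, shift_net):
    """Exact inverse: recompute scale/shift from the untouched half."""
    y1, y2 = np.split(y, 2)
    x2 = (y2 - shift_net(y1)) * np.exp(-scale_net(y1))
    return np.concatenate([y1, x2])

scale = lambda h: np.tanh(h)  # toy stand-ins for small neural networks
shift = lambda h: 2.0 * h

x = np.array([0.5, -1.0, 2.0, 0.3])
y = coupling_forward(x, scale, shift)
print(np.allclose(x, coupling_inverse(y, scale, shift)))  # True
```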
- Discovering and Explaining the Representation Bottleneck of DNNs [21.121270460158712]
This paper explores the bottleneck of feature representations of deep neural networks (DNNs).
We focus on the multi-order interaction between input variables, where the order represents the complexity of interactions.
We discover that a DNN is more likely to encode both overly simple and overly complex interactions, but usually fails to learn interactions of intermediate complexity (a brute-force sketch of the multi-order interaction metric follows this entry).
arXiv Detail & Related papers (2021-11-11T14:35:20Z)
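The multi-order interaction can be sketched directly; below is a brute-force Python version, assuming the definition used in this research line, where the order m is the size of the context around a pair of variables (the toy model v is a stand-in for evaluating a DNN on a masked input):

```python
from itertools import combinations

def multi_order_interaction(v, n, i, j, m):
    """I^(m)(i, j): marginal interaction of variables i and j averaged over
    all contexts S of size m drawn from the remaining variables, where
    v(frozenset(S)) is the model output with only the variables in S kept."""
    rest = [k for k in range(n) if k not in (i, j)]
    vals = [v(frozenset(S) | {i, j}) - v(frozenset(S) | {i})
            - v(frozenset(S) | {j}) + v(frozenset(S))
            for S in combinations(rest, m)]
    return sum(vals) / len(vals)

# Toy stand-in model: sum of the kept inputs plus one pairwise term.
x = [1.0, 2.0, -1.0, 0.5]
def v(S):
    pair = 3.0 if {0, 1} <= S else 0.0  # variables 0 and 1 interact
    return sum(x[k] for k in S) + pair

print(multi_order_interaction(v, 4, 0, 1, m=0))  # 3.0 (empty context)
print(multi_order_interaction(v, 4, 0, 1, m=2))  # 3.0 (full context)
```

The bottleneck finding is that, for real DNNs, the average strength of I^(m) is large at very small and very large m but dips at intermediate orders.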
- GNN is a Counter? Revisiting GNN for Question Answering [105.8253992750951]
State-of-the-art Question Answering (QA) systems commonly use pre-trained language models (LMs) to access the knowledge encoded in LMs, and elaborately designed modules based on Graph Neural Networks (GNNs) to perform reasoning over knowledge graphs (KGs).
Our work reveals that existing knowledge-aware GNN modules may only carry out simple reasoning such as counting (a toy counting baseline is sketched after this entry).
arXiv Detail & Related papers (2021-10-07T05:44:52Z)
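The paper's provocation is that a very simple counting module can rival elaborate GNN reasoning on QA benchmarks; the toy sketch below (not the paper's actual module) shows what "reasoning as counting" can look like on a retrieved KG subgraph:

```python
from collections import Counter

def count_features(edges, answer_node):
    """Summarize a KG subgraph by counting, per relation type, the edges
    incident to the candidate answer node."""
    counts = Counter()
    for head, relation, tail in edges:
        if answer_node in (head, tail):
            counts[relation] += 1
    return counts

# Hypothetical retrieved subgraph for one answer candidate.
subgraph = [("question_entity", "related_to", "candidate"),
            ("question_entity", "related_to", "other"),
            ("candidate", "is_a", "concept")]
print(count_features(subgraph, "candidate"))
# Counter({'related_to': 1, 'is_a': 1}): feeding such counts to a small
# scorer is often a surprisingly strong baseline.
```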
- A Practical Tutorial on Graph Neural Networks [49.919443059032226]
Graph neural networks (GNNs) have recently grown in popularity in the field of artificial intelligence (AI).
This tutorial exposes the power and novelty of GNNs to AI practitioners (a minimal message-passing layer is sketched after this entry).
arXiv Detail & Related papers (2020-10-11T12:36:17Z)
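For readers new to GNNs, the operation most architectures share is message passing; here is a minimal NumPy sketch of one mean-aggregation layer (a generic textbook form, not a construction from the tutorial itself):

```python
import numpy as np

def message_passing_layer(H, A, W):
    """One GNN layer: each node averages features over its neighborhood
    (including itself), then applies a linear map and a ReLU.
    H: (n, d) node features, A: (n, n) adjacency, W: (d, d_out) weights."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)  # neighborhood sizes
    H_agg = (A_hat @ H) / deg               # mean aggregation
    return np.maximum(H_agg @ W, 0.0)       # linear map + ReLU

# Toy graph: three nodes in a path 0 - 1 - 2, with 2-dim features.
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
H = np.array([[1., 0.], [0., 1.], [1., 1.]])
W = np.eye(2)
print(message_passing_layer(H, A, W))
```

Stacking k such layers lets information propagate k hops through the graph, which is the basic mechanism GNN tutorials typically build on.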
- Architecture Disentanglement for Deep Neural Networks [174.16176919145377]
We introduce neural architecture disentanglement (NAD) to explain the inner workings of deep neural networks (DNNs).
NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes.
Results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones.
arXiv Detail & Related papers (2020-03-30T08:34:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.