Technical Report: Quantifying and Analyzing the Generalization Power of a DNN
- URL: http://arxiv.org/abs/2505.06993v2
- Date: Tue, 20 May 2025 15:25:12 GMT
- Title: Technical Report: Quantifying and Analyzing the Generalization Power of a DNN
- Authors: Yuxuan He, Junpeng Zhang, Lei Cheng, Hongyuan Zhang, Quanshi Zhang
- Abstract summary: This paper proposes a new perspective for analyzing the generalization power of deep neural networks (DNNs). We show that the learning of non-generalizable interactions is the direct cause of the gap between the training and testing losses.
- Score: 32.57423950987925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a new perspective for analyzing the generalization power of deep neural networks (DNNs), i.e., directly disentangling and analyzing the dynamics of the generalizable and non-generalizable interactions encoded by a DNN through the training process. Specifically, this work builds upon a recent theoretical achievement in explainable AI, which proves that the detailed inference logic of a DNN can be strictly rewritten as a small number of AND-OR interaction patterns. Based on this, we propose an efficient method to quantify the generalization power of each interaction, and we discover a distinct three-phase dynamics of the generalization power of interactions during training. In particular, the early phase of training typically removes noisy and non-generalizable interactions and learns simple and generalizable ones. The second and third phases tend to capture increasingly complex interactions that are harder to generalize. Experimental results verify that the learning of non-generalizable interactions is the direct cause of the gap between the training and testing losses.
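To make the AND-OR interaction formalism mentioned in the abstract concrete, the sketch below computes a single AND (Harsanyi) interaction, I(S) = Σ_{T⊆S} (-1)^{|S|-|T|} v(x_T), by masking input variables to a baseline value. The toy model, the zero baseline, and all function names are illustrative assumptions; this is not the paper's implementation, and it does not reproduce the paper's efficient quantification of each interaction's generalization power.

```python
from itertools import chain, combinations

import numpy as np


def powerset(s):
    """All subsets of the index collection s, returned as tuples."""
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))


def masked_output(model_fn, x, baseline, kept):
    """Model output when only the variables indexed by `kept` are present;
    all other variables are replaced by the baseline value."""
    x_masked = baseline.copy()
    kept = list(kept)
    x_masked[kept] = x[kept]
    return model_fn(x_masked)


def and_interaction(model_fn, x, baseline, S):
    """Harsanyi (AND) interaction of the variable subset S:
    I(S) = sum over T subset of S of (-1)^(|S|-|T|) * v(x_T)."""
    return sum(
        (-1) ** (len(S) - len(T)) * masked_output(model_fn, x, baseline, T)
        for T in powerset(S)
    )


if __name__ == "__main__":
    # Hypothetical toy "model" with an AND pattern between variables 0 and 1.
    model_fn = lambda v: 2.0 * v[0] * v[1] + 0.5 * v[2]
    x = np.array([1.0, 1.0, 1.0])
    baseline = np.zeros(3)
    print(and_interaction(model_fn, x, baseline, (0, 1)))  # ~2.0, the AND pattern
    print(and_interaction(model_fn, x, baseline, (0, 2)))  # ~0.0, no interaction
```

The exhaustive enumeration over subsets is only feasible for small index sets; it is shown here purely to illustrate what an "interaction" refers to in the abstract.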
Related papers
- Revisiting Generalization Power of a DNN in Terms of Symbolic Interactions [24.77155634841885]
This paper aims to analyze the generalization power of deep neural networks (DNNs) from the perspective of interactions.
We find that the generalization power of a DNN can be explained as the generalization power of the interactions.
arXiv Detail & Related papers (2025-02-14T13:46:14Z)
- Towards Bridging Generalization and Expressivity of Graph Neural Networks [11.560730203511111]
We study the relationship between expressivity and generalization in graph neural networks (GNNs).
We introduce a novel framework that connects GNN generalization to the variance in graph structures they can capture.
We uncover a trade-off between intra-class concentration and inter-class separation, both of which are crucial for effective generalization.
arXiv Detail & Related papers (2024-10-14T00:31:16Z)
- Towards the Dynamics of a DNN Learning Symbolic Interactions [20.493304123269446]
A series of theorems have been proven in recent years to show that for a given input sample, a small set of interactions between input variables can be considered as primitive inference patterns.
This study proves the two-phase dynamics of interactions, providing a theoretical mechanism for how the generalization power of a DNN changes during the training process.
arXiv Detail & Related papers (2024-07-27T07:34:49Z)
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features [68.3512123520931]
We investigate the dynamics of a deep neural network (DNN) learning interactions.
In this paper, we discover the DNN learns interactions in two phases.
The first phase mainly penalizes interactions of medium and high orders, and the second phase mainly learns interactions of gradually increasing orders.
arXiv Detail & Related papers (2024-05-16T17:13:25Z)
- Defining and Extracting generalizable interaction primitives from DNNs [22.79131582164054]
We develop a new method to extract interactions that are shared by different deep neural networks (DNNs).
Experiments show that the extracted interactions can better reflect common knowledge shared by different DNNs.
arXiv Detail & Related papers (2024-01-29T17:21:41Z)
- Learning Dynamics and Generalization in Reinforcement Learning [59.530058000689884]
We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training.
We show that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods.
arXiv Detail & Related papers (2022-06-05T08:49:16Z)
- What can linearized neural networks actually say about generalization? [67.83999394554621]
In certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization.
We show that the linear approximations can indeed rank the learning complexity of certain tasks for neural networks.
Our work provides concrete examples of novel deep learning phenomena which can inspire future theoretical research.
arXiv Detail & Related papers (2021-06-12T13:05:11Z)
- What training reveals about neural network complexity [80.87515604428346]
This work explores the hypothesis that the complexity of the function a deep neural network (NN) is learning can be deduced by how fast its weights change during training.
Our results support the hypothesis that good training behavior can be a useful bias towards good generalization.
arXiv Detail & Related papers (2021-06-08T08:58:00Z)
- On Connections between Regularizations for Improving DNN Robustness [67.28077776415724]
This paper analyzes regularization terms proposed recently for improving the adversarial robustness of deep neural networks (DNNs).
We study possible connections between several effective methods, including input-gradient regularization, Jacobian regularization, curvature regularization, and a cross-Lipschitz functional (see the sketch after this list).
arXiv Detail & Related papers (2020-07-04T23:43:32Z)
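As a concrete illustration of one of the regularizers named in the last entry above, the following is a minimal PyTorch sketch of input-gradient regularization, where the squared norm of the loss gradient with respect to the input is added to the training objective. The model, penalty weight, and training step are hypothetical placeholders and do not reproduce the specific formulations compared in that paper.

```python
import torch
import torch.nn as nn


def input_gradient_penalty(model, x, y, criterion):
    """Squared L2 norm of d(loss)/d(input), averaged over the batch."""
    x = x.clone().detach().requires_grad_(True)
    loss = criterion(model(x), y)
    # create_graph=True lets the penalty itself be backpropagated through.
    (grad_x,) = torch.autograd.grad(loss, x, create_graph=True)
    return grad_x.pow(2).sum(dim=tuple(range(1, grad_x.dim()))).mean()


# Hypothetical model and training step (assumed, for illustration only).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
lambda_reg = 0.1  # assumed penalty weight

x = torch.randn(8, 10)
y = torch.randint(0, 2, (8,))

optimizer.zero_grad()
loss = criterion(model(x), y) + lambda_reg * input_gradient_penalty(model, x, y, criterion)
loss.backward()
optimizer.step()
```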