Spectral Gap Regularization of Neural Networks
- URL: http://arxiv.org/abs/2304.03096v1
- Date: Thu, 6 Apr 2023 14:23:40 GMT
- Title: Spectral Gap Regularization of Neural Networks
- Authors: Edric Tam, David Dunson
- Abstract summary: Fiedler regularization is a novel approach for regularizing neural networks that utilizes spectral/graphical information.
We provide an approximate, variational approach for faster computation during training.
We performed experiments on several datasets comparing Fiedler regularization with classical regularization methods such as dropout and weight decay.
- Score: 6.09170287691728
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Fiedler regularization, a novel approach for regularizing neural
networks that utilizes spectral/graphical information. Existing regularization
methods often focus on penalizing weights in a global/uniform manner that
ignores the connectivity structure of the neural network. We propose to use the
Fiedler value of the neural network's underlying graph as a tool for
regularization. We provide theoretical motivation for this approach via
spectral graph theory. We demonstrate several useful properties of the Fiedler
value that make it useful as a regularization tool. We provide an approximate,
variational approach for faster computation during training. We provide an
alternative formulation of this framework in the form of a structurally
weighted $\text{L}_1$ penalty, thus linking our approach to sparsity induction.
We provide uniform generalization error bounds for Fiedler regularization via a
Rademacher complexity analysis. We performed experiments on several datasets,
comparing Fiedler regularization with classical regularization methods such as
dropout and weight decay. Results demonstrate the efficacy of Fiedler
regularization. This is a journal extension of the conference paper by Tam and
Dunson (2020).
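To make the idea concrete, below is a minimal sketch of a Fiedler-value penalty for a small fully connected network: the network is viewed as an undirected weighted graph with edge weights |w_ij|, and the penalty is the second-smallest eigenvalue of the resulting graph Laplacian. This is an illustrative reconstruction rather than the paper's implementation; the paper relies on a faster variational approximation during training, and the function name and eigendecomposition-based computation below are assumptions.
```python
import torch

def fiedler_penalty(weight_mats):
    """Fiedler-value penalty for an MLP given as a list of weight matrices.

    The network is treated as an undirected weighted graph: neurons are nodes,
    and the edge between neuron j of layer k and neuron i of layer k+1 has
    weight |W_k[i, j]|. The penalty is the second-smallest eigenvalue of the
    graph Laplacian (the Fiedler value, i.e. the algebraic connectivity).
    Illustrative sketch only: the paper uses a faster variational approximation.
    """
    sizes = [weight_mats[0].shape[1]] + [W.shape[0] for W in weight_mats]
    offsets = [0]
    for s in sizes[:-1]:
        offsets.append(offsets[-1] + s)
    n = sum(sizes)
    A = torch.zeros(n, n)
    for k, W in enumerate(weight_mats):
        r, c = offsets[k + 1], offsets[k]            # rows: layer k+1 neurons, cols: layer k neurons
        A[r:r + W.shape[0], c:c + W.shape[1]] = W.abs()
    A = A + A.T                                      # symmetrize the adjacency matrix
    L = torch.diag(A.sum(dim=1)) - A                 # graph Laplacian
    return torch.linalg.eigvalsh(L)[1]               # eigenvalues ascend; index 1 = Fiedler value

# usage sketch: loss = task_loss + lam * fiedler_penalty([layer.weight for layer in mlp_layers])
```
The structurally weighted $\text{L}_1$ view follows from the variational characterization $\lambda_2 = \sum_{(i,j)} |w_{ij}| (f_i - f_j)^2$, where $f$ is the Fiedler vector: the penalty reads as an $\text{L}_1$ penalty on the weights with structural coefficients $(f_i - f_j)^2$.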
Related papers
- Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
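As a sketch of the mechanism (not the paper's algorithm), the snippet below shows a Krasnosel'skii-Mann style interpolated update: instead of applying an update operator T outright, the iterate moves only a fraction alpha of the way toward T(x), which is the averaging step that the theory of nonexpansive operators analyzes. The operator T, the fixed step count, and the function name are assumptions for illustration.
```python
def interpolated_update(T, x0, alpha=0.5, steps=100):
    """Krasnosel'skii-Mann style averaging (illustrative sketch).

    Rather than jumping straight to T(x), move only a fraction alpha toward it.
    For a nonexpansive operator T this damping stabilizes iterations that would
    otherwise oscillate or diverge.
    """
    x = x0
    for _ in range(steps):
        x = (1 - alpha) * x + alpha * T(x)   # linear interpolation between x and T(x)
    return x

# here T could stand for one (or several) base optimizer steps applied to the parameters
```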
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration [122.51142131506639]
We introduce a precise, fast, and differentiable upper bound for the spectral norm of convolutional layers using circulant matrix theory.
We show through a comprehensive set of experiments that our approach outperforms other state-of-the-art methods in terms of precision, computational cost, and scalability.
It proves highly effective for the Lipschitz regularization of convolutional neural networks, with competitive results against concurrent approaches.
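For intuition, here is a hedged sketch of Gram iteration on a dense weight matrix: each squaring of the Gram matrix squares its top singular value, so the rescaled Frobenius norm gives an upper bound on the spectral norm that tightens with every iteration. The paper's method targets convolutional layers via circulant matrix theory; the dense-matrix version and the function name below are assumptions.
```python
import numpy as np

def gram_spectral_norm_bound(W, n_iter=6):
    """Upper bound on sigma_1(W) via Gram iteration (dense-matrix sketch).

    With G_1 = W W^T and G_{t+1} = G_t^2, sigma_1(G_t) = sigma_1(W)**(2**t),
    so ||G_t||_F**(1 / 2**t) >= sigma_1(W). Rescaling at each step avoids
    overflow; the accumulated scale is tracked in log space.
    """
    G = W @ W.T
    log_scale = 0.0
    for _ in range(n_iter):
        fro = np.linalg.norm(G)                 # Frobenius norm used for rescaling
        log_scale = 2.0 * (log_scale + np.log(fro))
        G = (G / fro) @ (G / fro)               # square the rescaled Gram matrix
    return np.exp((log_scale + np.log(np.linalg.norm(G))) / 2 ** (n_iter + 1))

# sanity check: gram_spectral_norm_bound(W) >= np.linalg.norm(W, 2), tightening as n_iter grows
```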
arXiv Detail & Related papers (2023-05-25T15:32:21Z) - Iterative regularization in classification via hinge loss diagonal descent [12.684351703991965]
Iterative regularization is a classic idea in regularization theory that has recently become popular in machine learning.
In this paper, we focus on iterative regularization in the context of classification.
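As a minimal illustration of the iterative-regularization idea (not the paper's diagonal-descent scheme), one can run plain subgradient descent on the hinge loss and use early stopping as the regularizer, so the iteration count plays the role of the regularization parameter. The function name and step sizes below are assumptions.
```python
import numpy as np

def hinge_iterative_regularization(X, y, steps=200, lr=0.1):
    """Early-stopped subgradient descent on the mean hinge loss (sketch).

    y takes values in {-1, +1}. Fewer steps act as stronger regularization;
    no explicit penalty term is added.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w)
        active = margins < 1                                   # margin violations
        grad = -(X[active] * y[active, None]).sum(axis=0) / n  # subgradient of mean hinge loss
        w -= lr * grad
    return w

# prediction rule: np.sign(X_new @ w)
```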
arXiv Detail & Related papers (2022-12-24T07:15:26Z) - Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z) - Scaling Up Graph Neural Networks Via Graph Coarsening [18.176326897605225]
Scalability of graph neural networks (GNNs) is one of the major challenges in machine learning.
In this paper, we propose to use graph coarsening for scalable training of GNNs.
We show that, by simply applying off-the-shelf coarsening methods, we can reduce the number of nodes by up to a factor of ten without a noticeable drop in classification accuracy.
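A hedged sketch of the coarsening step, assuming an off-the-shelf coarsening method has already produced a node-to-cluster assignment; the coarse adjacency and cluster-averaged features would then be fed to the GNN. The partition-matrix construction and names below are illustrative, not the paper's code.
```python
import numpy as np

def coarsen_graph(A, X, assign):
    """Build a coarse graph from a node-to-cluster assignment (sketch).

    A: (n, n) adjacency matrix, X: (n, d) node features,
    assign: (n,) integer cluster labels covering 0..k-1 with no empty cluster.
    """
    n, k = A.shape[0], int(assign.max()) + 1
    P = np.zeros((n, k))
    P[np.arange(n), assign] = 1.0            # one-hot partition matrix
    sizes = P.sum(axis=0, keepdims=True)     # nodes per cluster
    A_c = P.T @ A @ P                        # summed inter-cluster edge weights
    X_c = (P.T @ X) / sizes.T                # mean feature per cluster
    return A_c, X_c
```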
arXiv Detail & Related papers (2021-06-09T15:46:17Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - On the Importance of Sampling in Learning Graph Convolutional Networks [13.713485304798368]
Graph Convolutional Networks (GCNs) have achieved impressive empirical advancement across a wide variety of graph-related applications.
Despite their success, training GCNs on large graphs suffers from computational and memory issues.
We describe and analyze a general doubly variance reduction schema that can accelerate any sampling method under the memory budget.
arXiv Detail & Related papers (2021-03-03T21:31:23Z) - Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring.
Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains.
The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning.
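For illustration only, a minimal sketch of an L2 penalty whose coefficient rises with the training step; the paper additionally differentiates the growing factors across weights to encode importance, which this uniform schedule does not capture, and the schedule constants below are assumptions.
```python
import torch

def growing_l2_penalty(params, step, base=1e-4, grow=1e-6):
    """L2 penalty with a coefficient that grows linearly in the step count (sketch)."""
    coeff = base + grow * step                          # rising penalty factor
    return coeff * sum((p ** 2).sum() for p in params)

# usage sketch: loss = task_loss + growing_l2_penalty(model.parameters(), global_step)
```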
arXiv Detail & Related papers (2020-12-16T20:16:28Z) - Fiedler Regularization: Learning Neural Networks with Graph Sparsity [6.09170287691728]
We introduce a novel regularization approach for deep learning that incorporates and respects the underlying graphical structure of the neural network.
We propose to use the Fiedler value of the neural network's underlying graph as a tool for regularization.
arXiv Detail & Related papers (2020-03-02T16:19:33Z) - Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)