Fiedler Regularization: Learning Neural Networks with Graph Sparsity
- URL: http://arxiv.org/abs/2003.00992v3
- Date: Sat, 15 Aug 2020 08:39:03 GMT
- Title: Fiedler Regularization: Learning Neural Networks with Graph Sparsity
- Authors: Edric Tam and David Dunson
- Abstract summary: We introduce a novel regularization approach for deep learning that incorporates and respects the underlying graphical structure of the neural network.
We propose to use the Fiedler value of the neural network's underlying graph as a tool for regularization.
- Score: 6.09170287691728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel regularization approach for deep learning that
incorporates and respects the underlying graphical structure of the neural
network. Existing regularization methods often focus on dropping/penalizing
weights in a global manner that ignores the connectivity structure of the
neural network. We propose to use the Fiedler value of the neural network's
underlying graph as a tool for regularization. We provide theoretical support
for this approach via spectral graph theory. We list several useful properties
of the Fiedler value that make it suitable for regularization. We provide an
approximate, variational approach for fast computation in practical training of
neural networks. We provide bounds on such approximations. We provide an
alternative but equivalent formulation of this framework in the form of a
structurally weighted L1 penalty, thus linking our approach to sparsity
induction. We performed experiments on several datasets comparing Fiedler
regularization with traditional regularization methods such as dropout and
weight decay. Results demonstrate the efficacy of Fiedler regularization.
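A concrete way to read the approach: view the network as a weighted graph whose nodes are neurons and whose edge weights are the absolute values of the connection weights, form the graph Laplacian, and add its second-smallest eigenvalue (the Fiedler value, i.e. the algebraic connectivity) to the training loss as a penalty. The PyTorch snippet below is a minimal sketch of that idea rather than the authors' implementation: the layer sizes, the helper name fiedler_value, and the constant FIEDLER_WEIGHT are illustrative, biases are ignored, and a dense eigendecomposition stands in for the paper's faster variational approximation.

```python
# Minimal sketch (not the authors' code): Fiedler-value penalty for a toy MLP.
# Assumptions: a two-layer MLP, edge weights |w|, biases ignored, and a dense
# eigendecomposition in place of the paper's variational approximation.
import torch
import torch.nn as nn

LAYER_SIZES = [4, 8, 3]      # illustrative architecture
FIEDLER_WEIGHT = 1e-3        # illustrative regularization strength

model = nn.Sequential(
    nn.Linear(LAYER_SIZES[0], LAYER_SIZES[1]), nn.ReLU(),
    nn.Linear(LAYER_SIZES[1], LAYER_SIZES[2]),
)

def fiedler_value(model, layer_sizes):
    """Second-smallest eigenvalue of the Laplacian of the network's weighted graph."""
    n = sum(layer_sizes)
    offsets = [sum(layer_sizes[:k]) for k in range(len(layer_sizes))]
    linears = [m for m in model if isinstance(m, nn.Linear)]
    adj = torch.zeros(n, n)
    for k, layer in enumerate(linears):
        w = layer.weight.abs()                     # shape (out_dim, in_dim)
        r, c = offsets[k + 1], offsets[k]          # rows: layer k+1 nodes, cols: layer k nodes
        adj[r:r + w.shape[0], c:c + w.shape[1]] = w
        adj[c:c + w.shape[1], r:r + w.shape[0]] = w.t()
    laplacian = torch.diag(adj.sum(dim=1)) - adj   # L = D - A
    eigvals = torch.linalg.eigvalsh(laplacian)     # ascending order, differentiable
    return eigvals[1]                              # Fiedler value / algebraic connectivity

# Usage in a training step: penalize the Fiedler value alongside the task loss.
x, y = torch.randn(16, LAYER_SIZES[0]), torch.randn(16, LAYER_SIZES[-1])
loss = nn.functional.mse_loss(model(x), y) + FIEDLER_WEIGHT * fiedler_value(model, LAYER_SIZES)
loss.backward()
```

One natural reading of the structurally weighted L1 formulation follows from the standard variational characterization of the Fiedler value: for a fixed unit vector v orthogonal to the all-ones vector, v^T L v = sum over edges of |w_ij| (v_i - v_j)^2, an L1 penalty on the weights whose coefficients (v_i - v_j)^2 encode the graph structure, and minimizing over such v recovers the Fiedler value itself.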
Related papers
- Convergence Analysis for Learning Orthonormal Deep Linear Neural Networks [27.29463801531576]
We provide convergence analysis for training orthonormal deep linear neural networks.
Our results shed light on how increasing the number of hidden layers can impact the convergence speed (see the orthonormal deep linear network sketch after this list).
arXiv Detail & Related papers (2023-11-24T18:46:54Z)
- Spectral Gap Regularization of Neural Networks [6.09170287691728]
Fiedler regularization is a novel approach for regularizing neural networks that utilizes spectral/graphical information.
We provide an approximate, variational approach for faster computation during training.
We performed experiments on datasets that compare Fiedler regularization with classical regularization methods such as dropout and weight decay.
arXiv Detail & Related papers (2023-04-06T14:23:40Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z)
- Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth [92.25666446274188]
Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth (see the sine-activation sketch after this list).
arXiv Detail & Related papers (2022-11-26T07:41:48Z)
- Fast Adaptation with Linearized Neural Networks [35.43406281230279]
We study the inductive biases of linearizations of neural networks, which we show to be surprisingly good summaries of the full network functions.
Inspired by this finding, we propose a technique for embedding these inductive biases into Gaussian processes through a kernel designed from the Jacobian of the network (see the Jacobian-kernel sketch after this list).
In this setting, domain adaptation takes the form of interpretable posterior inference, with accompanying uncertainty estimation.
arXiv Detail & Related papers (2021-03-02T03:23:03Z)
- A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in the depth, with convergence time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z)
- Compressive Sensing and Neural Networks from a Statistical Learning Perspective [4.561032960211816]
We present a generalization error analysis for a class of neural networks suitable for sparse reconstruction from few linear measurements.
Under realistic conditions, the generalization error scales only logarithmically in the number of layers and at most linearly in the number of measurements.
arXiv Detail & Related papers (2020-10-29T15:05:43Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool (see the Hessian-norm sketch after this list).
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
- Volumization as a Natural Generalization of Weight Decay [25.076488081589403]
Inspired by physics, we define a physical volume for the weight parameters in neural networks.
We show that this method is an effective way of regularizing neural networks.
arXiv Detail & Related papers (2020-03-25T07:13:55Z)
- Distance-Based Regularisation of Deep Networks for Fine-Tuning [116.71288796019809]
We develop an algorithm that constrains a hypothesis class to a small sphere centred on the initial pre-trained weights (see the weight-projection sketch after this list).
Empirical evaluation shows that our algorithm works well, corroborating our theoretical results.
arXiv Detail & Related papers (2020-02-19T16:00:47Z)
- Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)
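Illustrative sketches for several of the entries above follow. First, for the orthonormal deep linear network entry: a minimal example of the kind of model being analyzed, assuming square layers and PyTorch's built-in orthogonal parametrization; the constraint handling, sizes, and step sizes here are my choices, not the authors' algorithm.

```python
# Hedged sketch: a deep *linear* network whose layer weights are kept orthonormal
# via PyTorch's orthogonal parametrization. Architecture and training loop are
# illustrative only.
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

DIM, DEPTH = 16, 4
layers = [orthogonal(nn.Linear(DIM, DIM, bias=False)) for _ in range(DEPTH)]
net = nn.Sequential(*layers)                      # purely linear: no activations

teacher = torch.linalg.qr(torch.randn(DIM, DIM)).Q  # an orthogonal target map
x = torch.randn(128, DIM)
target = x @ teacher

opt = torch.optim.SGD(net.parameters(), lr=1e-2)
for step in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(x), target)
    loss.backward()
    opt.step()                                    # weights stay orthonormal by construction
```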
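For the sinusoidal networks entry, the sine-activation sketch below shows a generic MLP with sine activations and an explicit frequency-scale knob; the name bandwidth, the architecture, and the scaling are my illustrative choices, not the specific simplified parametrization proposed in that paper.

```python
# Hedged sketch: a generic sine-activation MLP with an explicit frequency scale.
import torch
import torch.nn as nn

class SineMLP(nn.Module):
    def __init__(self, in_dim, hidden, out_dim, bandwidth=10.0):
        super().__init__()
        self.bandwidth = bandwidth                 # larger -> higher-frequency features fit more easily
        self.first = nn.Linear(in_dim, hidden)
        self.second = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, x):
        h = torch.sin(self.bandwidth * self.first(x))   # frequency scale applied at the first layer
        h = torch.sin(self.second(h))
        return self.head(h)

# Fitting a 1-D signal: a smaller `bandwidth` should bias the fit toward smoother
# functions, in the spirit of the "low-pass filter with adjustable bandwidth" claim.
model = SineMLP(1, 64, 1, bandwidth=5.0)
coords = torch.linspace(-1, 1, 256).unsqueeze(1)
signal = torch.sin(4 * torch.pi * coords)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    nn.functional.mse_loss(model(coords), signal).backward()
    opt.step()
```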
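For the linearized-networks entry, the Jacobian-kernel sketch below builds a kernel from the network's parameter Jacobians; the scalar-output restriction and the helper names are my simplifications of "a kernel designed from the Jacobian of the network", not the authors' construction.

```python
# Hedged sketch: a kernel from parameter Jacobians (the linearized network's
# empirical NTK), usable as a Gaussian-process covariance.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(3, 16), nn.Tanh(), nn.Linear(16, 1))
params = list(net.parameters())

def param_jacobian(x):
    """Flattened gradient of the scalar output f(x; theta) with respect to theta."""
    out = net(x.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, params)
    return torch.cat([g.reshape(-1) for g in grads])

def jacobian_kernel(x1, x2):
    """k(x1, x2) = <J(x1), J(x2)>."""
    return param_jacobian(x1) @ param_jacobian(x2)

# A Gram matrix over a few inputs, e.g. as the covariance of a GP prior.
xs = torch.randn(5, 3)
gram = torch.stack([torch.stack([jacobian_kernel(a, b) for b in xs]) for a in xs])
print(gram.shape)  # torch.Size([5, 5])
```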
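For the initialization entry, the Hessian-norm sketch below tracks an estimate of the loss Hessian's spectral norm via power iteration on Hessian-vector products; the iteration count, the choice of spectral norm, and the toy model are mine, and the paper's estimator and control mechanism may differ.

```python
# Hedged sketch: Hessian spectral-norm estimate by power iteration on HVPs.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
params = [p for p in net.parameters() if p.requires_grad]

def hessian_spectral_norm(loss, params, iters=20):
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    v_norm = torch.sqrt(sum((u * u).sum() for u in v))
    v = [u / v_norm for u in v]
    est = torch.zeros(())
    for _ in range(iters):
        # Hessian-vector product: differentiate <grad, v> with respect to the parameters.
        hv = torch.autograd.grad(grads, params, grad_outputs=v, retain_graph=True)
        est = torch.sqrt(sum((h * h).sum() for h in hv))   # ||Hv|| approaches |lambda_max|
        v = [h / (est + 1e-12) for h in hv]
    return est.item()

x, y = torch.randn(64, 10), torch.randn(64, 1)
loss = nn.functional.mse_loss(net(x), y)
print(hessian_spectral_norm(loss, params))   # diagnostic: curvature scale of the loss
```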
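For the fine-tuning entry, the weight-projection sketch below keeps the weights inside an L2 ball around the pre-trained initialization, matching the "small sphere centred on the initial pre-trained weights" idea; the radius, the global (rather than per-layer) norm, and the project-after-every-step schedule are my choices, not necessarily the paper's.

```python
# Hedged sketch: fine-tuning with projection back onto an L2 ball around theta_0.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))  # stand-in for a pre-trained net
init_params = [p.detach().clone() for p in model.parameters()]
RADIUS = 1.0  # illustrative constraint radius

def project_to_ball(model, init_params, radius):
    """Rescale theta - theta_0 so that its global L2 norm is at most `radius`."""
    with torch.no_grad():
        deltas = [p - p0 for p, p0 in zip(model.parameters(), init_params)]
        total = torch.sqrt(sum((d * d).sum() for d in deltas))
        if total > radius:
            scale = radius / total
            for p, p0, d in zip(model.parameters(), init_params, deltas):
                p.copy_(p0 + scale * d)

opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(32, 8), torch.randint(0, 2, (32,))
for _ in range(10):
    opt.zero_grad()
    nn.functional.cross_entropy(model(x), y).backward()
    opt.step()
    project_to_ball(model, init_params, RADIUS)   # keep theta inside the ball after every step
```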
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.