Compressive Sensing and Neural Networks from a Statistical Learning
Perspective
- URL: http://arxiv.org/abs/2010.15658v4
- Date: Fri, 13 Aug 2021 09:28:31 GMT
- Title: Compressive Sensing and Neural Networks from a Statistical Learning
Perspective
- Authors: Arash Behboodi, Holger Rauhut, Ekkehard Schnoor
- Abstract summary: We present a generalization error analysis for a class of neural networks suitable for sparse reconstruction from few linear measurements.
Under realistic conditions, the generalization error scales only logarithmically in the number of layers, and at most linear in number of measurements.
- Score: 4.561032960211816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Various iterative reconstruction algorithms for inverse problems can be
unfolded as neural networks. Empirically, this approach has often led to
improved results, but theoretical guarantees are still scarce. While some
progress on generalization properties of neural networks have been made, great
challenges remain. In this chapter, we discuss and combine these topics to
present a generalization error analysis for a class of neural networks suitable
for sparse reconstruction from few linear measurements. The hypothesis class
considered is inspired by the classical iterative soft-thresholding algorithm
(ISTA). The neural networks in this class are obtained by unfolding iterations
of ISTA and learning some of the weights. Based on training samples, we aim at
learning the optimal network parameters via empirical risk minimization and
thereby the optimal network that reconstructs signals from their compressive
linear measurements. In particular, we may learn a sparsity basis that is
shared by all of the iterations/layers and thereby obtain a new approach for
dictionary learning. For this class of networks, we present a generalization
bound, which is based on bounding the Rademacher complexity of hypothesis
classes consisting of such deep networks via Dudley's integral. Remarkably,
under realistic conditions, the generalization error scales only
logarithmically in the number of layers, and at most linear in number of
measurements.
Related papers
- Fundamental limits of overparametrized shallow neural networks for
supervised learning [11.136777922498355]
We study a two-layer neural network trained from input-output pairs generated by a teacher network with matching architecture.
Our results come in the form of bounds relating i) the mutual information between training data and network weights, or ii) the Bayes-optimal generalization error.
arXiv Detail & Related papers (2023-07-11T08:30:50Z) - Hidden Classification Layers: Enhancing linear separability between
classes in neural networks layers [0.0]
We investigate the impact on deep network performances of a training approach.
We propose a neural network architecture which induces an error function involving the outputs of all the network layers.
arXiv Detail & Related papers (2023-06-09T10:52:49Z) - When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural learning.
Meanwhile, the structure of neural networks converged back to simplerwise and linear functions.
arXiv Detail & Related papers (2023-04-29T11:46:53Z) - Generalization and Estimation Error Bounds for Model-based Neural
Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that allow to construct model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel which we call textitbias-generalized NTK
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z) - Generalization Error Bounds for Iterative Recovery Algorithms Unfolded
as Neural Networks [6.173968909465726]
We introduce a general class of neural networks suitable for sparse reconstruction from few linear measurements.
By allowing a wide range of degrees of weight-sharing between the layers, we enable a unified analysis for very different neural network types.
arXiv Detail & Related papers (2021-12-08T16:17:33Z) - Critical Initialization of Wide and Deep Neural Networks through Partial
Jacobians: General Theory and Applications [6.579523168465526]
We introduce emphpartial Jacobians of a network, defined as derivatives of preactivations in layer $l$ with respect to preactivations in layer $l_0leq l$.
We derive recurrence relations for the norms of partial Jacobians and utilize these relations to analyze criticality of deep fully connected neural networks with LayerNorm and/or residual connections.
arXiv Detail & Related papers (2021-11-23T20:31:42Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.