Related papers: Compressive Sensing and Neural Networks from a Statistical Learning Perspective

Compressive Sensing and Neural Networks from a Statistical Learning Perspective

URL: http://arxiv.org/abs/2010.15658v4
Date: Fri, 13 Aug 2021 09:28:31 GMT
Title: Compressive Sensing and Neural Networks from a Statistical Learning Perspective
Authors: Arash Behboodi, Holger Rauhut, Ekkehard Schnoor
Abstract summary: We present a generalization error analysis for a class of neural networks suitable for sparse reconstruction from few linear measurements. Under realistic conditions, the generalization error scales only logarithmically in the number of layers, and at most linear in number of measurements.
Score: 4.561032960211816
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Various iterative reconstruction algorithms for inverse problems can be unfolded as neural networks. Empirically, this approach has often led to improved results, but theoretical guarantees are still scarce. While some progress on generalization properties of neural networks have been made, great challenges remain. In this chapter, we discuss and combine these topics to present a generalization error analysis for a class of neural networks suitable for sparse reconstruction from few linear measurements. The hypothesis class considered is inspired by the classical iterative soft-thresholding algorithm (ISTA). The neural networks in this class are obtained by unfolding iterations of ISTA and learning some of the weights. Based on training samples, we aim at learning the optimal network parameters via empirical risk minimization and thereby the optimal network that reconstructs signals from their compressive linear measurements. In particular, we may learn a sparsity basis that is shared by all of the iterations/layers and thereby obtain a new approach for dictionary learning. For this class of networks, we present a generalization bound, which is based on bounding the Rademacher complexity of hypothesis classes consisting of such deep networks via Dudley's integral. Remarkably, under realistic conditions, the generalization error scales only logarithmically in the number of layers, and at most linear in number of measurements.

Related papers

Fundamental limits of overparametrized shallow neural networks for supervised learning [11.136777922498355]
We study a two-layer neural network trained from input-output pairs generated by a teacher network with matching architecture. Our results come in the form of bounds relating i) the mutual information between training data and network weights, or ii) the Bayes-optimal generalization error.
arXiv Detail & Related papers (2023-07-11T08:30:50Z)
Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers [0.0]
We investigate the impact on deep network performances of a training approach. We propose a neural network architecture which induces an error function involving the outputs of all the network layers.
arXiv Detail & Related papers (2023-06-09T10:52:49Z)
When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural learning. Meanwhile, the structure of neural networks converged back to simplerwise and linear functions.
arXiv Detail & Related papers (2023-04-29T11:46:53Z)
Generalization and Estimation Error Bounds for Model-based Neural Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks. We derive practical design rules that allow to construct model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z)
Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations. We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime. We show that the neural networks possess a different limiting kernel which we call textitbias-generalized NTK We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z)
Generalization Error Bounds for Iterative Recovery Algorithms Unfolded as Neural Networks [6.173968909465726]
We introduce a general class of neural networks suitable for sparse reconstruction from few linear measurements. By allowing a wide range of degrees of weight-sharing between the layers, we enable a unified analysis for very different neural network types.
arXiv Detail & Related papers (2021-12-08T16:17:33Z)
Critical Initialization of Wide and Deep Neural Networks through Partial Jacobians: General Theory and Applications [6.579523168465526]
We introduce emphpartial Jacobians of a network, defined as derivatives of preactivations in layer $l$ with respect to preactivations in layer $l_0leq l$. We derive recurrence relations for the norms of partial Jacobians and utilize these relations to analyze criticality of deep fully connected neural networks with LayerNorm and/or residual connections.
arXiv Detail & Related papers (2021-11-23T20:31:42Z)
Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function. We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability. We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective. We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.