How do Minimum-Norm Shallow Denoisers Look in Function Space?
- URL: http://arxiv.org/abs/2311.06748v2
- Date: Tue, 16 Jan 2024 08:35:30 GMT
- Title: How do Minimum-Norm Shallow Denoisers Look in Function Space?
- Authors: Chen Zeno, Greg Ongie, Yaniv Blumenfeld, Nir Weinberger, Daniel Soudry
- Abstract summary: Neural network (NN) denoisers are an essential building block in many common tasks.
We characterize the functions realized by shallow ReLU NN denoisers with a minimal representation cost.
- Score: 36.14517933550934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural network (NN) denoisers are an essential building block in many common
tasks, ranging from image reconstruction to image generation. However, the
success of these models is not well understood from a theoretical perspective.
In this paper, we aim to characterize the functions realized by shallow ReLU NN
denoisers -- in the common theoretical setting of interpolation (i.e., zero
training loss) with a minimal representation cost (i.e., minimal $\ell^2$ norm
weights). First, for univariate data, we derive a closed form for the NN
denoiser function, find it is contractive toward the clean data points, and
prove it generalizes better than the empirical MMSE estimator at a low noise
level. Next, for multivariate data, we find the NN denoiser functions in a
closed form under various geometric assumptions on the training data: data
contained in a low-dimensional subspace, data contained in a union of one-sided
rays, or several types of simplexes. These functions decompose into a sum of
simple rank-one piecewise linear interpolations aligned with edges and/or faces
connecting training samples. We empirically verify this alignment phenomenon on
synthetic data and real images.
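The univariate setting described above can be illustrated numerically. The sketch below is only a toy stand-in for the paper's analysis: the network width, learning rate, and the small $\ell^2$ penalty (used here as a crude proxy for the minimal-representation-cost objective) are all illustrative choices. It trains a one-hidden-layer ReLU denoiser on a few clean/noisy 1D pairs and checks that the learned map sends each noisy sample back toward its clean target.

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean 1D training points and their noisy versions (interpolation setting).
clean = np.array([-2.0, 0.0, 1.0, 3.0])
noisy = clean + rng.normal(scale=0.3, size=clean.shape)

# One-hidden-layer ReLU net f(x) = sum_k a_k * relu(w_k x + b_k) + c.
# The tiny l2 penalty on (a, w) loosely mimics minimal l2-norm weights;
# all hyperparameters here are illustrative, not from the paper.
H = 32
w = rng.normal(size=H)
b = rng.normal(size=H)
a = rng.normal(size=H) * 0.1
c = 0.0
lr, wd = 5e-3, 1e-4

def forward(x):
    pre = np.outer(x, w) + b          # (N, H) pre-activations
    h = np.maximum(pre, 0.0)          # ReLU features
    return h @ a + c, h, pre

n = len(noisy)
for _ in range(40000):
    y, h, pre = forward(noisy)
    r = y - clean                     # residual on the training pairs
    gh = np.outer(r, a) * (pre > 0)   # backprop through the ReLU
    a -= lr * (h.T @ r / n + wd * a)
    w -= lr * (gh.T @ noisy / n + wd * w)
    b -= lr * (gh.sum(axis=0) / n)
    c -= lr * r.mean()

denoised, _, _ = forward(noisy)
# The trained denoiser maps each noisy sample close to its clean target.
print(np.max(np.abs(denoised - clean)))
```

Full-batch gradient descent suffices here because there are only four training pairs; with the penalty active, the fit is near-interpolating rather than exact.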
Related papers
- Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias
for Correlated Inputs [5.7166378791349315]
We prove that, for the fundamental regression task of learning a single neuron, training a one-hidden-layer ReLU network converges to zero loss.

We also show and characterise a surprising distinction in this setting between interpolator networks of minimal rank and those of minimal Euclidean norm.
arXiv Detail & Related papers (2023-06-10T16:36:22Z) - Score-based Diffusion Models in Function Space [137.70916238028306]
Diffusion models have recently emerged as a powerful framework for generative modeling.
This work introduces a mathematically rigorous framework called Denoising Diffusion Operators (DDOs) for training diffusion models in function space.
We show that the corresponding discretized algorithm generates accurate samples at a fixed cost independent of the data resolution.
arXiv Detail & Related papers (2023-02-14T23:50:53Z) - Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with a quadratic loss function, a fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
Our results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
arXiv Detail & Related papers (2022-12-05T14:47:52Z) - Noise Self-Regression: A New Learning Paradigm to Enhance Low-Light Images Without Task-Related Data [86.68013790656762]
We propose Noise SElf-Regression (NoiSER) without access to any task-related data.
NoiSER is highly competitive in enhancement quality, yet with a much smaller model size, and much lower training and inference cost.
arXiv Detail & Related papers (2022-11-09T06:18:18Z) - Benign Overfitting without Linearity: Neural Network Classifiers Trained
by Gradient Descent for Noisy Linear Data [44.431266188350655]
We consider the generalization error of two-layer neural networks trained to interpolation by gradient descent.
We show that neural networks exhibit benign overfitting: they can be driven to zero training error, perfectly fitting any noisy training labels, and simultaneously achieve minimax optimal test error.
In contrast to previous work on benign overfitting that require linear or kernel-based predictors, our analysis holds in a setting where both the model and learning dynamics are fundamentally nonlinear.
arXiv Detail & Related papers (2022-02-11T23:04:00Z) - Deep learning for inverse problems with unknown operator [0.0]
In inverse problems where the forward operator $T$ is unknown, we have access to training data consisting of functions $f_i$ and their noisy images $Tf_i$.
We propose a new method that requires minimal assumptions on the data, and prove reconstruction rates that depend on the number of training points and the noise level.
arXiv Detail & Related papers (2021-08-05T17:21:12Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can separate two well-separated classes of data with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
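The separation idea can be sketched numerically. The toy example below is an illustrative construction, not the paper's setup: two classes that are well separated but not linearly separable in input space (an inner disk and a surrounding annulus) become linearly separable after a random ReLU feature map with Gaussian weights and uniform biases. Separability is checked by fitting the ±1 labels exactly with least squares on the features.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two classes, well separated but not linearly separable in the plane:
# an inner disk (radius <= 1) and a surrounding annulus (radius in [2, 3]).
# Sizes, radii, and the least-squares check are illustrative choices.
n = 100
th1 = rng.uniform(0.0, 2.0 * np.pi, n)
th2 = rng.uniform(0.0, 2.0 * np.pi, n)
inner = rng.uniform(0.0, 1.0, n)[:, None] * np.c_[np.cos(th1), np.sin(th1)]
outer = rng.uniform(2.0, 3.0, n)[:, None] * np.c_[np.cos(th2), np.sin(th2)]
X = np.vstack([inner, outer])          # (2n, 2) inputs
t = np.r_[-np.ones(n), np.ones(n)]     # +-1 class labels

# Untrained two-layer ReLU features: standard Gaussian weights,
# uniformly distributed biases, as in the theorem's setting.
H = 500
W = rng.normal(size=(2, H))
b = rng.uniform(-3.0, 3.0, H)
Phi = np.maximum(X @ W + b, 0.0)       # (2n, H) random features

# If the random features make the classes linearly separable, a linear
# model on top can classify the +-1 labels perfectly; check it directly.
A = np.c_[Phi, np.ones(2 * n)]         # append a bias column
coef, *_ = np.linalg.lstsq(A, t, rcond=None)
acc = float(np.mean(np.sign(A @ coef) == t))
print(acc)
```

With many more random features than samples, the min-norm least-squares solution interpolates the labels, so perfect training accuracy here certifies linear separability in feature space.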
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Implicit Geometric Regularization for Learning Shapes [34.052738965233445]
We offer a new paradigm for computing high fidelity implicit neural representations directly from raw data.
We show that our method leads to state of the art implicit neural representations with higher level-of-details and fidelity compared to previous methods.
arXiv Detail & Related papers (2020-02-24T07:36:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.