Boundary between noise and information applied to filtering neural
network weight matrices
- URL: http://arxiv.org/abs/2206.03927v1
- Date: Wed, 8 Jun 2022 14:42:36 GMT
- Title: Boundary between noise and information applied to filtering neural
network weight matrices
- Authors: Max Staats, Matthias Thamm, Bernd Rosenow
- Abstract summary: We introduce an algorithm for noise filtering, which both removes small singular values and reduces the magnitude of large singular values.
For networks trained in the presence of label noise, we indeed find that the generalization performance improves significantly due to noise filtering.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks have been successfully applied to a broad range of
problems where overparametrization yields weight matrices which are partially
random. A comparison of weight matrix singular vectors to the Porter-Thomas
distribution suggests that there is a boundary between randomness and learned
information in the singular value spectrum. Inspired by this finding, we
introduce an algorithm for noise filtering, which both removes small singular
values and reduces the magnitude of large singular values to counteract the
effect of level repulsion between the noise and the information part of the
spectrum. For networks trained in the presence of label noise, we indeed find
that the generalization performance improves significantly due to noise
filtering.
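As a rough illustration of this style of spectral filtering, here is a minimal NumPy sketch; the rank cutoff `k` and the flat shrinkage factor are illustrative placeholders, not the paper's Porter-Thomas-derived prescription:

```python
import numpy as np

def filter_weight_matrix(W, k, shrink=0.9):
    """Spectral filtering of a weight matrix: zero out the small
    'noise' singular values beyond rank k, and shrink the large
    ones to counteract level repulsion between the noise and the
    information part of the spectrum.  Both k and the flat
    shrinkage factor are illustrative placeholders."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_new = np.zeros_like(s)
    s_new[:k] = shrink * s[:k]        # keep and shrink the top-k values
    return U @ np.diag(s_new) @ Vt

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 128))       # stand-in for a trained layer
W_filtered = filter_weight_matrix(W, k=10)
print(np.linalg.matrix_rank(W_filtered))   # -> 10
```

In the paper's setting, `W` would be a trained layer's weight matrix and the boundary `k` would be estimated from the singular-vector statistics rather than fixed by hand.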
Related papers
- On the Sample Complexity of One Hidden Layer Networks with Equivariance, Locality and Weight Sharing [12.845681770287005]
Weight sharing, equivariance, and local filters, as in convolutional neural networks, are believed to contribute to the sample efficiency of neural networks.
We obtain lower and upper sample complexity bounds for a class of single hidden layer networks.
We show that the bound depends merely on the norm of filters, which is tighter than using the spectral norm of the respective matrix.
arXiv Detail & Related papers (2024-11-21T16:36:01Z)
- Linear Attention Based Deep Nonlocal Means Filtering for Multiplicative Noise Removal [0.0]
Multiplicative noise widely exists in radar images, medical images and other important fields' images.
We linearize the nonlocal means algorithm with deep learning and propose a linear-attention-based deep nonlocal means filtering network (LDNLM); a sketch of the linear attention idea follows this entry.
Experiments on both simulated and real multiplicative noise demonstrate that the LDNLM is more competitive than state-of-the-art methods.
arXiv Detail & Related papers (2024-07-06T14:22:07Z)
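Classical nonlocal means compares every patch with every other patch, which is quadratic in the number of patches; linear attention sidesteps this by factorizing the similarity through a feature map. A minimal sketch of that generic linear attention trick (the feature map phi(x) = elu(x) + 1 is a common choice, not necessarily the exact LDNLM architecture):

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """O(N) attention: instead of forming the N x N patch-similarity
    matrix (as classical nonlocal means would), aggregate keys and
    values once through a feature map phi, here phi(x) = elu(x) + 1."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))   # elu(x) + 1
    Qf, Kf = phi(Q), phi(K)               # (N, d) feature maps
    KV = Kf.T @ V                         # (d, d_v), computed once
    Z = Qf @ Kf.sum(axis=0)               # (N,) normalizers
    return (Qf @ KV) / (Z[:, None] + eps)

rng = np.random.default_rng(1)
N, d = 1024, 32                           # patches and embedding size
Q = K = V = rng.normal(size=(N, d))       # toy patch embeddings
print(linear_attention(Q, K, V).shape)    # (1024, 32)
```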
- Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise [19.496063739638924]
We consider the problem of Bayesian inference for a spiked matrix model with structured noise.
We show how to predict the statistical limits using an efficient algorithm inspired by the theory of adaptive Thouless-Anderson-Palmer equations.
arXiv Detail & Related papers (2024-05-31T16:38:35Z)
- Neural Network-augmented Kalman Filtering for Robust Online Speech Dereverberation in Noisy Reverberant Environments [13.49645012479288]
A neural network-augmented algorithm for noise-robust online dereverberation is proposed.
The presented framework allows for robust dereverberation on a single-channel noisy reverberant dataset.
arXiv Detail & Related papers (2022-04-06T11:38:04Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the usual assumption that the noise distribution should match the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
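For context on the entry above: noise-contrastive estimation fits a model by classifying data samples against samples from a chosen noise distribution, so the choice of noise directly shapes the estimator. A minimal sketch of the standard NCE objective, assuming a unit-variance Gaussian model and a deliberately mismatched Laplace noise distribution:

```python
import numpy as np

def log_sigmoid(z):
    """Numerically stable log of the logistic sigmoid."""
    return np.where(z >= 0, -np.log1p(np.exp(-z)), z - np.log1p(np.exp(z)))

def nce_loss(theta, x_data, x_noise, log_noise_pdf):
    """Standard NCE objective: logistic classification of data vs.
    noise samples, with logits given by the log-density ratio of the
    model (a unit-variance Gaussian with unknown mean theta) to the
    chosen noise distribution."""
    log_model = lambda x: -0.5 * (x - theta) ** 2 - 0.5 * np.log(2 * np.pi)
    logit_d = log_model(x_data) - log_noise_pdf(x_data)
    logit_n = log_model(x_noise) - log_noise_pdf(x_noise)
    return -(log_sigmoid(logit_d).mean() + log_sigmoid(-logit_n).mean())

rng = np.random.default_rng(2)
x_data = rng.normal(loc=1.5, size=2000)           # true mean is 1.5
x_noise = rng.laplace(scale=1.0, size=2000)       # noise unlike the data
log_noise = lambda x: -np.abs(x) - np.log(2.0)    # Laplace(0, 1) log-pdf
thetas = np.linspace(-1.0, 3.0, 401)
losses = [nce_loss(t, x_data, x_noise, log_noise) for t in thetas]
print(thetas[int(np.argmin(losses))])             # close to 1.5
```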
- SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
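In the low-resolution setting above, the network only ever observes coarsely quantized in-phase/quadrature (IQ) samples. A minimal sketch of generating such data, assuming a uniform mid-rise quantizer; with one bit, only the sign of each component survives, which is why that case is hardest:

```python
import numpy as np

def quantize(x, bits):
    """Uniform mid-rise quantizer on [-1, 1]; with bits=1 only the
    sign of the input survives (output is +/- 0.5)."""
    levels = 2 ** bits
    step = 2.0 / levels
    idx = np.clip(np.floor((x + 1.0) / step), 0, levels - 1)
    return -1.0 + step * (idx + 0.5)

rng = np.random.default_rng(3)
n = np.arange(64)
f = 0.12                                         # normalized frequency
x = np.exp(2j * np.pi * f * n)                   # complex sinusoid
x += 0.1 * (rng.normal(size=64) + 1j * rng.normal(size=64))
x /= np.max(np.abs(x))                           # fit the quantizer range
iq3 = quantize(x.real, 3) + 1j * quantize(x.imag, 3)
iq1 = quantize(x.real, 1) + 1j * quantize(x.imag, 1)
print(np.abs(iq3 - x).mean(), np.abs(iq1 - x).mean())  # 1-bit loses far more
```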
- Learning based signal detection for MIMO systems with unknown noise statistics [84.02122699723536]
This paper aims to devise a generalized maximum likelihood (ML) estimator to robustly detect signals with unknown noise statistics.
In practice, there is little or even no statistical knowledge on the system noise, which in many cases is non-Gaussian, impulsive and not analyzable.
Our framework is driven by an unsupervised learning approach, where only the noise samples are required.
arXiv Detail & Related papers (2021-01-21T04:48:15Z)
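One way to realize the idea above, given only noise recordings, is to learn a noise density from those samples and plug it into a maximum-likelihood search over candidate symbols. A minimal sketch using a kernel density estimate as the learned noise model (the paper's actual estimator is an unsupervised network, not a KDE):

```python
import numpy as np
from itertools import product

def kde_logpdf(r, noise_samples, h=0.3):
    """Componentwise log-density of residuals r under a Gaussian
    kernel density estimate fitted to noise-only recordings."""
    z = (r[:, None] - noise_samples[None, :]) / h
    dens = np.mean(np.exp(-0.5 * z ** 2), axis=1) / (h * np.sqrt(2 * np.pi))
    return np.log(dens + 1e-300)

def detect(y, H, constellation, noise_samples):
    """Generalized ML detection: choose the symbol vector whose
    residual y - H s is most likely under the learned noise model.
    Brute force over all candidates, so only for small systems."""
    best, best_ll = None, -np.inf
    for s in product(constellation, repeat=H.shape[1]):
        ll = kde_logpdf(y - H @ np.array(s), noise_samples).sum()
        if ll > best_ll:
            best, best_ll = np.array(s), ll
    return best

rng = np.random.default_rng(4)
H = rng.normal(size=(4, 2))
s_true = np.array([1.0, -1.0])                       # BPSK symbols
y = H @ s_true + 0.3 * rng.standard_t(df=3, size=4)  # impulsive noise
noise_rec = 0.3 * rng.standard_t(df=3, size=5000)    # noise-only recordings
print(detect(y, H, [-1.0, 1.0], noise_rec))          # likely [ 1. -1.]
```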
- Explicit Regularisation in Gaussian Noise Injections [64.11680298737963]
We study the regularisation induced in neural networks by Gaussian noise injections (GNIs).
We derive the explicit regulariser of GNIs, obtained by marginalising out the injected noise.
We show analytically and empirically that such regularisation produces calibrated classifiers with large classification margins.
arXiv Detail & Related papers (2020-07-14T21:29:46Z)
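The injection itself is simple: add zero-mean Gaussian noise to activations at training time and switch it off at test time; the entry's contribution is the explicit regulariser obtained by marginalising that noise out. A minimal sketch of the injection only, assuming a two-layer NumPy network (the derived regulariser is not reproduced here):

```python
import numpy as np

def forward(x, W1, W2, sigma=0.1, train=True, rng=None):
    """Two-layer net with Gaussian noise injected into the hidden
    activations at train time (the GNI); at test time the injection
    is switched off, as with dropout."""
    h = np.tanh(x @ W1)
    if train:
        h = h + sigma * rng.normal(size=h.shape)   # the injection
    return h @ W2

rng = np.random.default_rng(5)
W1 = 0.3 * rng.normal(size=(8, 16))
W2 = 0.3 * rng.normal(size=(16, 1))
x = rng.normal(size=(4, 8))
y_train = forward(x, W1, W2, rng=rng)              # stochastic
y_test = forward(x, W1, W2, train=False)           # deterministic
print(np.abs(y_train - y_test).mean())             # footprint of the noise
```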
- Shape Matters: Understanding the Implicit Bias of the Noise Covariance [76.54300276636982]
Noise in gradient descent provides a crucial implicit regularization effect for training overparameterized models.
We show that parameter-dependent noise -- induced by mini-batches or label perturbation -- is far more effective than Gaussian noise.
Our analysis reveals that parameter-dependent noise introduces a bias towards local minima with smaller noise variance, whereas spherical Gaussian noise does not.
arXiv Detail & Related papers (2020-06-15T18:31:02Z)
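To make the distinction above concrete: label-perturbation noise enters the gradient through the data matrix, so its covariance is anisotropic and state-dependent, whereas spherical Gaussian noise has the same covariance everywhere. A minimal sketch contrasting the two in one SGD step for linear least squares (illustrative, not the paper's experimental setup):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(32, 10))                 # one mini-batch
y = X @ rng.normal(size=10)                   # clean labels
w = np.zeros(10)
lr, sigma = 0.01, 0.5

# (a) label perturbation: the injected error passes through X, so the
#     resulting gradient noise has covariance shaped by the data
y_pert = y + sigma * rng.normal(size=y.shape)
grad_label = 2.0 * X.T @ (X @ w - y_pert) / len(y)

# (b) spherical Gaussian: same isotropic covariance at every w
grad_clean = 2.0 * X.T @ (X @ w - y) / len(y)
grad_gauss = grad_clean + sigma * rng.normal(size=w.shape)

print((w - lr * grad_label)[:3])
print((w - lr * grad_gauss)[:3])
```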
- Sparse Mixture of Local Experts for Efficient Speech Enhancement [19.645016575334786]
We investigate a deep learning approach for speech denoising through an efficient ensemble of specialist neural networks.
By splitting up the speech denoising task into non-overlapping subproblems, we are able to improve denoising performance while also reducing computational complexity.
Our findings demonstrate that a fine-tuned ensemble network is able to exceed the speech denoising capabilities of a generalist network.
arXiv Detail & Related papers (2020-05-16T23:23:22Z)
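The efficiency argument above comes from hard routing: a gate scores all specialists but only the selected one is evaluated. A minimal sketch of top-1 routing, with linear "experts" standing in for the specialist denoising networks:

```python
import numpy as np

rng = np.random.default_rng(7)
d, n_experts = 16, 4
experts = [0.1 * rng.normal(size=(d, d)) for _ in range(n_experts)]
W_gate = 0.1 * rng.normal(size=(d, n_experts))

def denoise(frame):
    """Top-1 sparse routing: the gate scores every specialist, but
    only the best-scoring expert is actually evaluated, which keeps
    the ensemble's per-frame compute close to a single network's."""
    k = int(np.argmax(frame @ W_gate))   # hard expert selection
    return frame @ experts[k], k

frame = rng.normal(size=d)               # e.g. features of one noisy frame
out, chosen = denoise(frame)
print(chosen, out.shape)
```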
- ADRN: Attention-based Deep Residual Network for Hyperspectral Image Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one.
Experimental results demonstrate that our proposed ADRN scheme outperforms state-of-the-art methods in both quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.