Related papers: Pivotal Auto-Encoder via Self-Normalizing ReLU

Pivotal Auto-Encoder via Self-Normalizing ReLU

URL: http://arxiv.org/abs/2406.16052v1
Date: Sun, 23 Jun 2024 09:06:52 GMT
Title: Pivotal Auto-Encoder via Self-Normalizing ReLU
Authors: Nelson Goldenstein, Jeremias Sulam, Yaniv Romano,
Abstract summary: We formalize single hidden layer sparse auto-encoders as a transform learning problem. We propose an optimization problem that leads to a predictive model invariant to the noise level at test time. Our experimental results demonstrate that the trained models yield a significant improvement in stability against varying types of noise.
Score: 20.76999663290342
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sparse auto-encoders are useful for extracting low-dimensional representations from high-dimensional data. However, their performance degrades sharply when the input noise at test time differs from the noise employed during training. This limitation hinders the applicability of auto-encoders in real-world scenarios where the level of noise in the input is unpredictable. In this paper, we formalize single hidden layer sparse auto-encoders as a transform learning problem. Leveraging the transform modeling interpretation, we propose an optimization problem that leads to a predictive model invariant to the noise level at test time. In other words, the same pre-trained model is able to generalize to different noise levels. The proposed optimization algorithm, derived from the square root lasso, is translated into a new, computationally efficient auto-encoding architecture. After proving that our new method is invariant to the noise level, we evaluate our approach by training networks using the proposed architecture for denoising tasks. Our experimental results demonstrate that the trained models yield a significant improvement in stability against varying types of noise compared to commonly used architectures.

Related papers

Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems. Such problems are encountered in medicine, physics, and machine learning. We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets. We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
Noisy Pair Corrector for Dense Retrieval [59.312376423104055]
We propose a novel approach called Noisy Pair Corrector (NPC) NPC consists of a detection module and a correction module. We conduct experiments on text-retrieval benchmarks Natural Question and TriviaQA, code-search benchmarks StaQC and SO-DS.
arXiv Detail & Related papers (2023-11-07T08:27:14Z)
Latent Class-Conditional Noise Model [54.56899309997246]
We introduce a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework. We then deduce a dynamic label regression method for LCCN, whose Gibbs sampler allows us efficiently infer the latent true labels. Our approach safeguards the stable update of the noise transition, which avoids previous arbitrarily tuning from a mini-batch of samples.
arXiv Detail & Related papers (2023-02-19T15:24:37Z)
Walking Noise: On Layer-Specific Robustness of Neural Architectures against Noisy Computations and Associated Characteristic Learning Dynamics [1.5184189132709105]
We discuss the implications of additive, multiplicative and mixed noise for different classification tasks and model architectures. We propose a methodology called Walking Noise which injects layer-specific noise to measure the robustness. We conclude with a discussion of the use of this methodology in practice, among others, discussing its use for tailored multi-execution in noisy environments.
arXiv Detail & Related papers (2022-12-20T17:09:08Z)
Improving the Robustness of Summarization Models by Detecting and Removing Input Noise [50.27105057899601]
We present a large empirical study quantifying the sometimes severe loss in performance from different types of input noise for a range of datasets and model sizes. We propose a light-weight method for detecting and removing such noise in the input during model inference without requiring any training, auxiliary models, or even prior knowledge of the type of noise.
arXiv Detail & Related papers (2022-12-20T00:33:11Z)
Robust Time Series Denoising with Learnable Wavelet Packet Transform [1.370633147306388]
In many applications, signal denoising is often the first pre-processing step before any subsequent analysis or learning task. We propose to apply a deep learning denoising model inspired by a signal processing, a learnable version of wavelet packet transform. We demonstrate how the proposed algorithm relates to the universality of signal processing methods and the learning capabilities of deep learning approaches.
arXiv Detail & Related papers (2022-06-13T13:05:58Z)
Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR) Specifically, we propose to inject the standard Gaussian noise and regularize hidden representations of the fine-tuned model. We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
CDLNet: Noise-Adaptive Convolutional Dictionary Learning Network for Blind Denoising and Demosaicing [4.975707665155918]
Unrolled optimization networks present an interpretable alternative to constructing deep neural networks. We propose an unrolled convolutional dictionary learning network (CDLNet) and demonstrate its competitive denoising and demosaicing (JDD) performance. Specifically, we show that the proposed model outperforms state-of-the-art fully convolutional denoising and JDD models when scaled to a similar parameter count.
arXiv Detail & Related papers (2021-12-02T01:23:21Z)
Diffusion-Based Representation Learning [65.55681678004038]
We augment the denoising score matching framework to enable representation learning without any supervised signal. In contrast, the introduced diffusion-based representation learning relies on a new formulation of the denoising score matching objective. Using the same approach, we propose to learn an infinite-dimensional latent code that achieves improvements of state-of-the-art models on semi-supervised image classification.
arXiv Detail & Related papers (2021-05-29T09:26:02Z)
CDLNet: Robust and Interpretable Denoising Through Deep Convolutional Dictionary Learning [6.6234935958112295]
Unrolled optimization networks propose an interpretable alternative to constructing deep neural networks. We show that the proposed model outperforms the state-of-the-art denoising models when scaled to similar parameter count.
arXiv Detail & Related papers (2021-03-05T01:15:59Z)
Switching Variational Auto-Encoders for Noise-Agnostic Audio-visual Speech Enhancement [26.596930749375474]
We introduce the use of a latent sequential variable with Markovian dependencies to switch between different VAE architectures through time. We derive the corresponding variational expectation-maximization algorithm to estimate the parameters of the model and enhance the speech signal.
arXiv Detail & Related papers (2021-02-08T11:45:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.