Related papers: Ditch the Denoiser: Emergence of Noise Robustness in Self-Supervised Learning from Data Curriculum

Ditch the Denoiser: Emergence of Noise Robustness in Self-Supervised Learning from Data Curriculum

URL: http://arxiv.org/abs/2505.12191v1
Date: Sun, 18 May 2025 01:37:58 GMT
Title: Ditch the Denoiser: Emergence of Noise Robustness in Self-Supervised Learning from Data Curriculum
Authors: Wenquan Lu, Jiaqi Zhang, Hugues Van Assel, Randall Balestriero,
Abstract summary: Self-Supervised Learning (SSL) has become a powerful solution to extract rich representations from unlabeled data.<n>Applying SSL on noisy data remains a challenge, despite being crucial to applications such as astrophysics, medical imaging, geophysics or finance.<n>We present a fully self-supervised framework that enables noise-robust representation learning without requiring a denoiser at inference or downstream fine-tuning.
Score: 15.31692175683912
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Self-Supervised Learning (SSL) has become a powerful solution to extract rich representations from unlabeled data. Yet, SSL research is mostly focused on clean, curated and high-quality datasets. As a result, applying SSL on noisy data remains a challenge, despite being crucial to applications such as astrophysics, medical imaging, geophysics or finance. In this work, we present a fully self-supervised framework that enables noise-robust representation learning without requiring a denoiser at inference or downstream fine-tuning. Our method first trains an SSL denoiser on noisy data, then uses it to construct a denoised-to-noisy data curriculum (i.e., training first on denoised, then noisy samples) for pretraining a SSL backbone (e.g., DINOv2), combined with a teacher-guided regularization that anchors noisy embeddings to their denoised counterparts. This process encourages the model to internalize noise robustness. Notably, the denoiser can be discarded after pretraining, simplifying deployment. On ImageNet-1k with ViT-B under extreme Gaussian noise ($\sigma=255$, SNR = 0.72 dB), our method improves linear probing accuracy by 4.8% over DINOv2, demonstrating that denoiser-free robustness can emerge from noise-aware pretraining. The code is available at https://github.com/wenquanlu/noisy_dinov2.

Related papers

Self-Calibrated Variance-Stabilizing Transformations for Real-World Image Denoising [19.08732222562782]
Supervised deep learning has become the method of choice for image denoising. We show that, contrary to popular belief, denoising networks specialized in the removal of Gaussian noise can be efficiently leveraged in favor of real-world image denoising.
arXiv Detail & Related papers (2024-07-24T16:23:46Z)
Denoising-Aware Contrastive Learning for Noisy Time Series [35.97130925600067]
Time series self-supervised learning (SSL) aims to exploit unlabeled data for pre-training to mitigate the reliance on labels. We propose denoising-aware contrastive learning (DECL) to mitigate the noise in the representation and automatically selects suitable denoising methods for every sample.
arXiv Detail & Related papers (2024-06-07T04:27:32Z)
Impact of Noisy Supervision in Foundation Model Learning [91.56591923244943]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.<n>We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks. We propose a light-weight black-box tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
Latent Class-Conditional Noise Model [54.56899309997246]
We introduce a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework. We then deduce a dynamic label regression method for LCCN, whose Gibbs sampler allows us efficiently infer the latent true labels. Our approach safeguards the stable update of the noise transition, which avoids previous arbitrarily tuning from a mini-batch of samples.
arXiv Detail & Related papers (2023-02-19T15:24:37Z)
Identifying Hard Noise in Long-Tailed Sample Distribution [76.16113794808001]
We introduce Noisy Long-Tailed Classification (NLT) Most de-noising methods fail to identify the hard noises. We design an iterative noisy learning framework called Hard-to-Easy (H2E)
arXiv Detail & Related papers (2022-07-27T09:03:03Z)
Hard Sample Aware Noise Robust Learning for Histopathology Image Classification [4.75542005200538]
We introduce a novel hard sample aware noise robust learning method for histopathology image classification. To distinguish the informative hard samples from the harmful noisy ones, we build an easy/hard/noisy (EHN) detection model. We propose a noise suppressing and hard enhancing (NSHE) scheme to train the noise robust model.
arXiv Detail & Related papers (2021-12-05T11:07:55Z)
IDR: Self-Supervised Image Denoising via Iterative Data Refinement [66.5510583957863]
We present a practical unsupervised image denoising method to achieve state-of-the-art denoising performance. Our method only requires single noisy images and a noise model, which is easily accessible in practical raw image denoising. To evaluate raw image denoising performance in real-world applications, we build a high-quality raw image dataset SenseNoise-500 that contains 500 real-life scenes.
arXiv Detail & Related papers (2021-11-29T07:22:53Z)
Unpaired Learning of Deep Image Denoising [80.34135728841382]
This paper presents a two-stage scheme by incorporating self-supervised learning and knowledge distillation. For self-supervised learning, we suggest a dilated blind-spot network (D-BSN) to learn denoising solely from real noisy images. Experiments show that our unpaired learning method performs favorably on both synthetic noisy images and real-world noisy photographs.
arXiv Detail & Related papers (2020-08-31T16:22:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.