MID: A Self-supervised Multimodal Iterative Denoising Framework
- URL: http://arxiv.org/abs/2511.00997v1
- Date: Sun, 02 Nov 2025 16:13:52 GMT
- Title: MID: A Self-supervised Multimodal Iterative Denoising Framework
- Authors: Chang Nie, Tianchen Deng, Zhe Liu, Hesheng Wang
- Abstract summary: Real-world data is frequently corrupted by complex, non-linear noise. We propose a novel self-supervised multimodal iterative denoising framework, MID. Experiments across four classic computer vision tasks demonstrate MID's robustness, adaptability, and consistent state-of-the-art performance.
- Score: 21.9870371385388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data denoising is a persistent challenge across scientific and engineering domains. Real-world data is frequently corrupted by complex, non-linear noise, rendering traditional rule-based denoising methods inadequate. To overcome these obstacles, we propose a novel self-supervised multimodal iterative denoising (MID) framework. MID models the collected noisy data as a state within a continuous process of non-linear noise accumulation. By iteratively introducing further noise, MID learns two neural networks: one to estimate the current noise step and another to predict and subtract the corresponding noise increment. For complex non-linear contamination, MID employs a first-order Taylor expansion to locally linearize the noise process, enabling effective iterative removal. Crucially, MID does not require paired clean-noisy datasets, as it learns noise characteristics directly from the noisy inputs. Experiments across four classic computer vision tasks demonstrate MID's robustness, adaptability, and consistent state-of-the-art performance. Moreover, MID exhibits strong performance and adaptability in tasks within the biomedical and bioinformatics domains.
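The iterative loop described in the abstract can be illustrated with a minimal sketch, assuming a toy multiplicative contamination process in place of real noise. The two functions `estimate_step` and `predict_increment` are hand-written stand-ins for the two learned networks, and every name here (`RATE`, `contaminate`, `iterative_denoise`, and so on) is hypothetical rather than taken from the paper. The sketch does not reproduce the self-supervised training or the Taylor linearization; it only shows the control flow of estimating the current step and subtracting the matching increment.

```python
import numpy as np

RATE = 0.1  # assumed per-step gain drift of the toy contamination process


def contaminate(x, steps):
    """Toy forward process: each step multiplies the state by (1 + RATE)."""
    return x * (1.0 + RATE) ** steps


def estimate_step(x):
    """Stand-in for the step-estimation network.

    Assumes the clean signal has mean 1, so the mean of x reveals the step.
    """
    return int(round(np.log(np.mean(x)) / np.log(1.0 + RATE)))


def predict_increment(x, t):
    """Stand-in for the increment-prediction network.

    Returns the amount added by the most recent step, x_t - x_{t-1}.
    The step index t is unused because the toy drift rate is constant.
    """
    return x * RATE / (1.0 + RATE)


def iterative_denoise(x, max_iters=100):
    """Subtract one predicted increment at a time until the step estimate is 0."""
    x = np.asarray(x, dtype=float).copy()
    for _ in range(max_iters):
        t = estimate_step(x)
        if t <= 0:
            break
        x = x - predict_increment(x, t)
    return x


clean = np.array([0.5, 1.0, 1.5])      # toy clean signal with mean 1
noisy = contaminate(clean, steps=4)    # four steps of accumulated contamination
denoised = iterative_denoise(noisy)
print(np.allclose(denoised, clean))
```

In this toy the stand-ins know the contamination process exactly, so the loop recovers the clean signal; in the framework described above, both roles are filled by networks trained on noisy data alone.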
Related papers
- Test-time Adaptive Hierarchical Co-enhanced Denoising Network for Reliable Multimodal Classification [55.56234913868664]
We propose Test-time Adaptive Hierarchical Co-enhanced Denoising Network (TAHCD) for reliable learning on multimodal data. The proposed method achieves superior classification performance, robustness, and generalization compared with state-of-the-art reliable multimodal learning approaches.
arXiv Detail & Related papers (2026-01-12T03:14:12Z) - Real Noise Decoupling for Hyperspectral Image Denoising [14.247569090609828]
Hyperspectral image (HSI) denoising is a crucial step in enhancing the quality of HSIs. Noise modeling methods can fit noise distributions to generate synthetic HSIs to train denoising networks. We propose a multi-stage noise-decoupling framework that decomposes complex noise into explicitly modeled and implicitly modeled components.
arXiv Detail & Related papers (2025-11-21T12:23:07Z) - Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance [54.88271057438763]
Noise Awareness Guidance (NAG) is a correction method that explicitly steers sampling trajectories to remain consistent with the pre-defined noise schedule. NAG consistently mitigates noise shift and substantially improves the generation quality of mainstream diffusion models.
arXiv Detail & Related papers (2025-10-14T13:31:34Z) - Conditional GAN for Enhancing Diffusion Models in Efficient and Authentic Global Gesture Generation from Audios [10.57695963534794]
Methods based on VAEs are accompanied by issues of local jitter and global instability.
We introduce a conditional GAN to capture audio control signals and implicitly match the multimodal denoising distribution between the diffusion and denoising steps.
arXiv Detail & Related papers (2024-10-27T07:25:11Z) - Learning Noise-Robust Joint Representation for Multimodal Emotion Recognition under Incomplete Data Scenarios [23.43319138048058]
Multimodal emotion recognition (MER) in practical scenarios is significantly challenged by the presence of missing or incomplete data.
Traditional methods have often involved discarding data or substituting data segments with zero vectors to approximate these incompletenesses.
We introduce a novel noise-robust MER model that effectively learns robust multimodal joint representations from noisy data.
arXiv Detail & Related papers (2023-09-21T10:49:02Z) - Explainable Artificial Intelligence driven mask design for self-supervised seismic denoising [0.0]
Self-supervised coherent noise suppression methods require extensive knowledge of the noise statistics.
We propose the use of explainable artificial intelligence approaches to see inside the black box that is the denoising network.
We show that a simple averaging of the Jacobian contributions over a number of randomly selected input pixels provides an indication of the most effective mask.
arXiv Detail & Related papers (2023-07-13T11:02:55Z) - Realistic Noise Synthesis with Diffusion Models [44.404059914652194]
Deep denoising models require extensive real-world training data, which is challenging to acquire. We propose a novel Realistic Noise Synthesis Diffusor (RNSD) method using diffusion models to address these challenges.
arXiv Detail & Related papers (2023-05-23T12:56:01Z) - A Free Lunch to Person Re-identification: Learning from Automatically Generated Noisy Tracklets [52.30547023041587]
Unsupervised video-based re-identification (re-ID) methods have been proposed to reduce the high labor cost of annotating re-ID datasets.
However, their performance is still far lower than that of their supervised counterparts.
In this paper, we propose to tackle this problem by learning re-ID models from automatically generated person tracklets.
arXiv Detail & Related papers (2022-04-02T16:18:13Z) - C2N: Practical Generative Noise Modeling for Real-World Denoising [53.96391787869974]
We introduce a Clean-to-Noisy image generation framework, namely C2N, to imitate complex real-world noise without using paired examples.
We construct the noise generator in C2N accordingly with each component of real-world noise characteristics to express a wide range of noise accurately.
arXiv Detail & Related papers (2022-02-19T05:53:46Z) - Removing Noise from Extracellular Neural Recordings Using Fully Convolutional Denoising Autoencoders [62.997667081978825]
We propose a Fully Convolutional Denoising Autoencoder, which learns to produce a clean neuronal activity signal from a noisy multichannel input.
Experimental results on simulated data show that our proposed method can significantly improve the quality of noise-corrupted neural signals.
arXiv Detail & Related papers (2021-09-18T14:51:24Z) - Adaptive noise imitation for image denoising [58.21456707617451]
We develop a new adaptive noise imitation (ADANI) algorithm that can synthesize noisy data from naturally noisy images.
To produce realistic noise, a noise generator takes unpaired noisy/clean images as input, where the noisy image is a guide for noise generation.
Coupling the noisy data output from ADANI with the corresponding ground-truth, a denoising CNN is then trained in a fully-supervised manner.
arXiv Detail & Related papers (2020-11-30T02:49:36Z) - Learning Model-Blind Temporal Denoisers without Ground Truths [46.778450578529814]
Denoisers trained with synthetic data often fail to cope with the diversity of unknown noises.
Previous image-based methods lead to noise overfitting when directly applied to video denoisers.
We propose a general framework for video denoising networks that successfully addresses these challenges.
arXiv Detail & Related papers (2020-07-07T07:19:48Z)