Related papers: Test-time Adaptive Hierarchical Co-enhanced Denoising Network for Reliable Multimodal Classification

URL: http://arxiv.org/abs/2601.07163v1
Date: Mon, 12 Jan 2026 03:14:12 GMT
Title: Test-time Adaptive Hierarchical Co-enhanced Denoising Network for Reliable Multimodal Classification
Authors: Shu Shen, C. L. Philip Chen, Tong Zhang
Abstract summary: We propose Test-time Adaptive Hierarchical Co-enhanced Denoising Network (TAHCD) for reliable learning on multimodal data. The proposed method achieves superior classification performance, robustness, and generalization compared with state-of-the-art reliable multimodal learning approaches.
Score: 55.56234913868664
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reliable learning on low-quality multimodal data is a widespread concern, especially in safety-critical applications. However, multimodal noise poses a major challenge in this domain and leads existing methods to suffer from two key limitations. First, they struggle to reliably remove heterogeneous data noise, hindering robust multimodal representation learning. Second, they exhibit limited adaptability and generalization when encountering previously unseen noise. To address these issues, we propose Test-time Adaptive Hierarchical Co-enhanced Denoising Network (TAHCD). On one hand, TAHCD introduces the Adaptive Stable Subspace Alignment and Sample-Adaptive Confidence Alignment to reliably remove heterogeneous noise. They account for noise at both global and instance levels and enable joint removal of modality-specific and cross-modality noise, achieving robust learning. On the other hand, TAHCD introduces test-time cooperative enhancement, which adaptively updates the model in response to input noise in a label-free manner, improving adaptability and generalization. This is achieved by collaboratively enhancing the joint removal process of modality-specific and cross-modality noise across global and instance levels according to sample noise. Experiments on multiple benchmarks demonstrate that the proposed method achieves superior classification performance, robustness, and generalization compared with state-of-the-art reliable multimodal learning approaches.

Related papers

Quality-Aware Robust Multi-View Clustering for Heterogeneous Observation Noise [12.720216418233795]
We propose a novel framework termed Quality-Aware Robust Multi-View Clustering (QARMVC). QARMVC employs an information bottleneck mechanism to extract intrinsic semantics for view reconstruction. In experiments on five benchmark datasets, QARMVC consistently outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2026-02-26T03:16:44Z)
Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance [54.88271057438763]
Noise Awareness Guidance (NAG) is a correction method that explicitly steers sampling trajectories to remain consistent with the pre-defined noise schedule. NAG consistently mitigates noise shift and substantially improves the generation quality of mainstream diffusion models.
arXiv Detail & Related papers (2025-10-14T13:31:34Z)
Dual-granularity Sinkhorn Distillation for Enhanced Learning from Long-tailed Noisy Data [67.25796812343454]
Real-world datasets for deep learning frequently suffer from the co-occurring challenges of class imbalance and label noise. We propose Dual-granularity Sinkhorn Distillation (D-SINK), a novel framework that enhances dual robustness by distilling and integrating complementary insights. Experiments on benchmark datasets demonstrate that D-SINK significantly improves robustness and achieves strong empirical performance in learning from long-tailed noisy data.
arXiv Detail & Related papers (2025-10-09T13:05:27Z)
Modality-Specific Speech Enhancement and Noise-Adaptive Fusion for Acoustic and Body-Conduction Microphone Framework [0.0]
Body-conduction microphone signals (BMS) bypass airborne sound, providing strong noise resistance. We propose a novel multi-modal framework that combines BMS and acoustic microphone signals (AMS) to achieve both noise suppression and high-frequency reconstruction.
arXiv Detail & Related papers (2025-08-24T12:45:34Z)
MICINet: Multi-Level Inter-Class Confusing Information Removal for Reliable Multimodal Classification [57.08108545219043]
A reliable multimodal classification method dubbed Multi-Level Inter-Class Confusing Information Removal Network (MICINet) is proposed. MICINet achieves the reliable removal of both types of noise by unifying them into the concept of Inter-class Confusing Information (ICI) and eliminating it at both global and individual levels. Experiments on four datasets demonstrate that MICINet outperforms other state-of-the-art reliable multimodal classification methods under various noise conditions.
arXiv Detail & Related papers (2025-02-27T01:33:28Z)
Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation [91.83820250747935]
Pseudo-label noise is mainly contained in unstable samples in which predictions of most pixels undergo significant variations during self-training. We introduce the Stable Neighbor Denoising (SND) approach, which effectively discovers highly correlated stable and unstable samples. SND consistently outperforms state-of-the-art methods in various SFUDA semantic segmentation settings.
arXiv Detail & Related papers (2024-06-10T21:44:52Z)
Trusted Multi-view Learning under Noisy Supervision [20.668620759102115]
We propose a method to develop a reliable multi-view learning model under the guidance of noisy labels. TMNR employs evidential deep neural networks to construct view-specific opinions that capture both beliefs and uncertainty. TMNR2 identifies potentially mislabeled samples through evidence-label consistency and generates pseudo-labels from neighboring information.
arXiv Detail & Related papers (2024-04-18T06:47:30Z)
Effective Causal Discovery under Identifiable Heteroscedastic Noise Model [45.98718860540588]
Causal DAG learning has recently achieved promising performance in terms of both accuracy and efficiency. We propose a novel formulation for DAG learning that accounts for the variation in noise variance across variables and observations. We then propose an effective two-phase iterative DAG learning algorithm to address the increasing optimization difficulties.
arXiv Detail & Related papers (2023-12-20T08:51:58Z)
Improve Noise Tolerance of Robust Loss via Noise-Awareness [60.34670515595074]
We propose a meta-learning method capable of adaptively learning a hyperparameter prediction function, called Noise-Aware-Robust-Loss-Adjuster (NARL-Adjuster for brevity). Four SOTA robust loss functions are integrated with our algorithm, and comprehensive experiments substantiate the general availability and effectiveness of the proposed method in both its noise tolerance and performance.
arXiv Detail & Related papers (2023-01-18T04:54:58Z)
Noise-Tolerant Learning for Audio-Visual Action Recognition [31.641972732424463]
Video datasets are usually coarse-annotated or collected from the Internet. We propose a noise-tolerant learning framework to find anti-interference model parameters against both noisy labels and noisy correspondence. Our method significantly improves the robustness of the action recognition model and surpasses the baselines by a clear margin.
arXiv Detail & Related papers (2022-05-16T12:14:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.