Gradient-Guided Modality Decoupling for Missing-Modality Robustness
- URL: http://arxiv.org/abs/2402.16318v1
- Date: Mon, 26 Feb 2024 05:50:43 GMT
- Title: Gradient-Guided Modality Decoupling for Missing-Modality Robustness
- Authors: Hao Wang, Shengda Luo, Guosheng Hu and Jianguo Zhang
- Abstract summary: We introduce a novel indicator, gradients, to monitor and reduce modality dominance.
We present a novel Gradient-guided Modality Decoupling (GMD) method to decouple the dependency on dominating modalities.
In addition, to flexibly handle modal-incomplete data, we design a parameter-efficient Dynamic Sharing framework.
- Score: 24.95911972867697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal learning with incomplete input data (missing modality) is
practical and challenging. In this work, we conduct an in-depth analysis of
this challenge and find that modality dominance has a significant negative
impact on model training, greatly degrading missing-modality performance.
Motivated by Grad-CAM, we introduce a novel indicator, gradients, to monitor
and reduce the modality dominance that widely exists in the missing-modality
scenario. With the aid of this indicator, we present a novel Gradient-guided
Modality Decoupling (GMD) method to decouple the dependency on dominating
modalities. Specifically, GMD removes the conflicting gradient components from
different modalities to achieve this decoupling, significantly improving
performance. In addition, to flexibly handle modal-incomplete data, we design
a parameter-efficient Dynamic Sharing (DS) framework which can adaptively
switch network parameters on and off based on whether a modality is available.
We conduct extensive experiments on three popular multimodal benchmarks:
BraTS 2018 for medical segmentation, and CMU-MOSI and CMU-MOSEI for sentiment
analysis. The results show that our method significantly outperforms the
competitors, demonstrating the effectiveness of the proposed solutions. Our
code is released here:
https://github.com/HaoWang420/Gradient-guided-Modality-Decoupling.
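The abstract treats gradients as an indicator of modality dominance. As a rough illustration, one way to monitor this is to compare the gradient magnitude each modality's features receive from the loss. A minimal sketch, assuming the indicator is a per-modality gradient norm taken at the fused features (the function name, dict layout, and norm-based reading are illustrative assumptions, not the paper's verified definition):

```python
import torch

def dominance_indicator(loss: torch.Tensor,
                        feats: dict[str, torch.Tensor]) -> dict[str, float]:
    """Gradient norm of the loss w.r.t. each modality's features.

    A much larger norm for one modality suggests it dominates training;
    this norm-based reading is an assumption for illustration only.
    """
    grads = torch.autograd.grad(loss, list(feats.values()), retain_graph=True)
    return {name: g.norm().item() for name, g in zip(feats, grads)}
```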
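The decoupling step, removing conflicting gradient components, reads like gradient surgery in the spirit of PCGrad. A minimal sketch under that assumption, projecting out the component of one modality's gradient that points against another's (an illustrative reading, not the paper's verified rule):

```python
import torch

def decouple_gradient(g_dom: torch.Tensor, g_weak: torch.Tensor) -> torch.Tensor:
    """Drop the component of g_dom that conflicts with g_weak.

    If the two gradients disagree (negative inner product), project g_dom
    onto the plane orthogonal to g_weak so the dominant modality can no
    longer push against the weaker one.
    """
    dot = torch.dot(g_dom.flatten(), g_weak.flatten())
    if dot < 0:  # conflict: the gradients point in opposing directions
        g_dom = g_dom - (dot / g_weak.norm().pow(2)) * g_weak
    return g_dom

# Tiny usage example with one shared parameter and two modality losses.
w = torch.randn(4, requires_grad=True)
loss_a = (2 * w).sum()    # stand-in for the dominant modality's loss
loss_b = (-w).sum()       # stand-in for the weaker modality's loss
g_a, = torch.autograd.grad(loss_a, w, retain_graph=True)
g_b, = torch.autograd.grad(loss_b, w)
w.grad = decouple_gradient(g_a, g_b) + g_b  # combined update, conflict removed
```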
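The Dynamic Sharing framework is described only as switching parameters on and off by modality availability. A toy sketch of one plausible design, splitting each layer into always-on shared weights plus modality-specific weights that are skipped when a modality is absent (the shared/specific split and the class name are assumptions):

```python
import torch
import torch.nn as nn

class DynamicSharingBlock(nn.Module):
    """Hypothetical DS-style layer: shared weights always run; each
    modality-specific branch runs only when that modality is present."""

    def __init__(self, dim: int, num_modalities: int):
        super().__init__()
        self.shared = nn.Linear(dim, dim)
        self.specific = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_modalities)]
        )

    def forward(self, feats: list[torch.Tensor | None]) -> list[torch.Tensor | None]:
        out = []
        for m, x in enumerate(feats):
            if x is None:   # modality missing: its branch stays switched off
                out.append(None)
            else:
                out.append(self.shared(x) + self.specific[m](x))
        return out
```

Because the shared weights serve every availability pattern and only the small modality-specific branches scale with the number of modalities, a design along these lines would match the abstract's "parameter-efficient" claim.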
Related papers
- Classifier-guided Gradient Modulation for Enhanced Multimodal Learning [50.7008456698935]
Classifier-Guided Gradient Modulation (CGGM) is a novel method to balance multimodal learning with gradients.
We conduct extensive experiments on four multimodal datasets: UPMC Food-101, CMU-MOSI, IEMOCAP and BraTS.
CGGM consistently outperforms all baselines and other state-of-the-art methods.
arXiv Detail & Related papers (2024-11-03T02:38:43Z)
- Mind the Gap: Promoting Missing Modality Brain Tumor Segmentation with Alignment [21.571977754383518]
Brain tumor segmentation is often based on multiple magnetic resonance imaging (MRI) modalities.
In clinical practice, certain MRI modalities may be missing, which presents an even more difficult scenario.
We propose a novel paradigm that aligns latent features of involved modalities to a well-defined distribution anchor.
arXiv Detail & Related papers (2024-09-28T14:37:42Z)
- MedMAP: Promoting Incomplete Multi-modal Brain Tumor Segmentation with Alignment [20.358300924109162]
In clinical practice, certain MRI modalities may be missing, which presents a more difficult scenario.
Knowledge Distillation, Domain Adaptation, and Shared Latent Space have emerged as promising strategies.
We propose a novel paradigm that aligns latent features of the involved modalities to a well-defined distribution anchor that substitutes for the pre-trained model.
arXiv Detail & Related papers (2024-08-18T13:16:30Z)
- Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel semantic scene completion (SSC) framework, the Adversarial Modality Modulation Network (AMMNet).
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z)
- Exploring Missing Modality in Multimodal Egocentric Datasets [89.76463983679058]
We introduce a novel concept, the Missing Modality Token (MMT), to maintain performance even when modalities are absent.
Our method mitigates the performance loss, reducing the drop from ~30% to only ~10% when half of the test set is modal-incomplete.
arXiv Detail & Related papers (2024-01-21T11:55:42Z)
- Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition [69.32305810128994]
Incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition.
We propose the Hierarchical Crossmodal Transformer with Dynamic Modality Gating (HCT-DMG), a lightweight incongruity-aware model.
HCT-DMG: 1) outperforms previous multimodal models with a reduced size of approximately 0.8M parameters; 2) recognizes hard samples where incongruity makes affect recognition difficult; 3) mitigates the incongruity at the latent level in crossmodal attention.
arXiv Detail & Related papers (2023-05-23T01:24:15Z)
- Learning Progressive Modality-shared Transformers for Effective Visible-Infrared Person Re-identification [27.75907274034702]
We propose a novel deep learning framework named Progressive Modality-shared Transformer (PMT) for effective VI-ReID.
To reduce the negative effect of modality gaps, we first take the gray-scale images as an auxiliary modality and propose a progressive learning strategy.
To cope with the problem of large intra-class differences and small inter-class differences, we propose a Discriminative Center Loss.
arXiv Detail & Related papers (2022-12-01T02:20:16Z)
- Exploiting modality-invariant feature for robust multimodal emotion recognition with missing modalities [76.08541852988536]
We propose to use invariant features for a missing modality imagination network (IF-MMIN).
We show that the proposed model outperforms all baselines and consistently improves overall emotion recognition performance under uncertain missing-modality conditions.
arXiv Detail & Related papers (2022-10-27T12:16:25Z)
- On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z)