Classifier-guided Gradient Modulation for Enhanced Multimodal Learning
- URL: http://arxiv.org/abs/2411.01409v1
- Date: Sun, 03 Nov 2024 02:38:43 GMT
- Title: Classifier-guided Gradient Modulation for Enhanced Multimodal Learning
- Authors: Zirun Guo, Tao Jin, Jingyuan Chen, Zhou Zhao,
- Abstract summary: Gradient-Guided Modulation (CGGM) is a novel method to balance multimodal learning with gradients.
We conduct extensive experiments on four multimodal datasets: UPMC-Food 101, CMU-MOSI, IEMOCAP and BraTS.
CGGM outperforms all the baselines and other state-of-the-art methods consistently.
- Score: 50.7008456698935
- License:
- Abstract: Multimodal learning has developed very fast in recent years. However, during the multimodal training process, the model tends to rely on only one modality based on which it could learn faster, thus leading to inadequate use of other modalities. Existing methods to balance the training process always have some limitations on the loss functions, optimizers and the number of modalities and only consider modulating the magnitude of the gradients while ignoring the directions of the gradients. To solve these problems, in this paper, we present a novel method to balance multimodal learning with Classifier-Guided Gradient Modulation (CGGM), considering both the magnitude and directions of the gradients. We conduct extensive experiments on four multimodal datasets: UPMC-Food 101, CMU-MOSI, IEMOCAP and BraTS 2021, covering classification, regression and segmentation tasks. The results show that CGGM outperforms all the baselines and other state-of-the-art methods consistently, demonstrating its effectiveness and versatility. Our code is available at https://github.com/zrguo/CGGM.
Related papers
- ReconBoost: Boosting Can Achieve Modality Reconcilement [89.4377895465204]
We study the modality-alternating learning paradigm to achieve reconcilement.
We propose a new method called ReconBoost to update a fixed modality each time.
We show that the proposed method resembles Friedman's Gradient-Boosting (GB) algorithm, where the updated learner can correct errors made by others.
arXiv Detail & Related papers (2024-05-15T13:22:39Z) - Gradient-Guided Modality Decoupling for Missing-Modality Robustness [24.95911972867697]
We introduce a novel indicator, gradients, to monitor and reduce modality dominance.
We present a novel Gradient-guided Modality Decoupling (GMD) method to decouple the dependency on dominating modalities.
In addition, to flexibly handle modal-incomplete data, we design a parameter-efficient Dynamic Sharing framework.
arXiv Detail & Related papers (2024-02-26T05:50:43Z) - Multimodal Representation Learning by Alternating Unimodal Adaptation [73.15829571740866]
We propose MLA (Multimodal Learning with Alternating Unimodal Adaptation) to overcome challenges where some modalities appear more dominant than others during multimodal learning.
MLA reframes the conventional joint multimodal learning process by transforming it into an alternating unimodal learning process.
It captures cross-modal interactions through a shared head, which undergoes continuous optimization across different modalities.
Experiments are conducted on five diverse datasets, encompassing scenarios with complete modalities and scenarios with missing modalities.
arXiv Detail & Related papers (2023-11-17T18:57:40Z) - Improving Discriminative Multi-Modal Learning with Large-Scale
Pre-Trained Models [51.5543321122664]
This paper investigates how to better leverage large-scale pre-trained uni-modal models to enhance discriminative multi-modal learning.
We introduce Multi-Modal Low-Rank Adaptation learning (MMLoRA)
arXiv Detail & Related papers (2023-10-08T15:01:54Z) - Scaling Multimodal Pre-Training via Cross-Modality Gradient
Harmonization [68.49738668084693]
Self-supervised pre-training recently demonstrates success on large-scale multimodal data.
Cross-modality alignment (CMA) is only a weak and noisy supervision.
CMA might cause conflicts and biases among modalities.
arXiv Detail & Related papers (2022-11-03T18:12:32Z) - Balanced Multimodal Learning via On-the-fly Gradient Modulation [10.5602074277814]
Multimodal learning helps to comprehensively understand the world, by integrating different senses.
We propose on-the-fly gradient modulation to adaptively control the optimization of each modality, via monitoring the discrepancy of their contribution towards the learning objective.
arXiv Detail & Related papers (2022-03-29T08:26:38Z) - Contextual Gradient Scaling for Few-Shot Learning [24.19934081878197]
We propose contextual gradient scaling (CxGrad) for model-agnostic meta-learning (MAML)
CxGrad scales gradient norms of the backbone to facilitate learning task-specific knowledge in the inner-loop.
Experimental results show that CxGrad effectively encourages the backbone to learn task-specific knowledge in the inner-loop.
arXiv Detail & Related papers (2021-10-20T03:05:58Z) - Multi-Domain Learning by Meta-Learning: Taking Optimal Steps in
Multi-Domain Loss Landscapes by Inner-Loop Learning [5.490618192331097]
We consider a model-agnostic solution to the problem of Multi-Domain Learning for multi-modal applications.
Our method is model-agnostic, requiring no additional model parameters and no network architecture changes.
We demonstrate our solution to a fitting problem in medical imaging, specifically, in the automatic segmentation of white matter hyperintensity.
arXiv Detail & Related papers (2021-02-25T19:54:44Z) - Regularizing Meta-Learning via Gradient Dropout [102.29924160341572]
meta-learning models are prone to overfitting when there are no sufficient training tasks for the meta-learners to generalize.
We introduce a simple yet effective method to alleviate the risk of overfitting for gradient-based meta-learning.
arXiv Detail & Related papers (2020-04-13T10:47:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.