MMANet: Margin-aware Distillation and Modality-aware Regularization for Incomplete Multimodal Learning
- URL: http://arxiv.org/abs/2304.08028v1
- Date: Mon, 17 Apr 2023 07:22:15 GMT
- Title: MMANet: Margin-aware Distillation and Modality-aware Regularization for Incomplete Multimodal Learning
- Authors: Shicai Wei, Yang Luo, Chunbo Luo
- Abstract summary: MMANet is a general framework for assisting incomplete multimodal learning.
It consists of three components: a deployment network used for inference, a teacher network that transfers comprehensive multimodal information, and a regularization network that guides the deployment network to balance weak modality combinations.
- Score: 4.647741695828225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal learning has shown great potential in numerous scenarios and
has attracted increasing interest recently. However, it often encounters the problem
of missing modality data and thus suffers severe performance degradation in
practice. To this end, we propose a general framework called MMANet to assist
incomplete multimodal learning. It consists of three components: the deployment
network used for inference, the teacher network transferring comprehensive
multimodal information to the deployment network, and the regularization
network guiding the deployment network to balance weak modality combinations.
Specifically, we propose a novel margin-aware distillation (MAD) to assist the
information transfer by weighing the sample contribution with the
classification uncertainty. This encourages the deployment network to focus on
the samples near decision boundaries and acquire the refined inter-class
margin. Besides, we design a modality-aware regularization (MAR) algorithm to
mine the weak modality combinations and guide the regularization network to
calculate prediction loss for them. This forces the deployment network to
improve its representation ability for the weak modality combinations
adaptively. Finally, extensive experiments on multimodal classification and
segmentation tasks demonstrate that our MMANet outperforms the state-of-the-art
significantly. Code is available at: https://github.com/shicaiwei123/MMANet
Related papers
- Robust Multimodal Learning via Representation Decoupling [6.7678581401558295]
Multimodal learning has attracted increasing attention due to its practicality.
Existing methods tend to address it by learning a common subspace representation for different modality combinations.
We propose a novel Decoupled Multimodal Representation Network (DMRNet) to assist robust multimodal learning.
arXiv Detail & Related papers (2024-07-05T12:09:33Z)
- Continual Learning: Forget-free Winning Subnetworks for Video Representations [75.40220771931132]
A Winning Subnetwork (WSN), selected for task performance, is considered for various continual learning tasks.
It leverages pre-existing weights from dense networks to achieve efficient learning in Task Incremental Learning (TIL) and Task-agnostic Incremental Learning (TaIL) scenarios.
The use of a Fourier Subneural Operator (FSO) within WSN is considered for Video Incremental Learning (VIL).
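As a rough illustration of the winning-subnetwork idea (a generic lottery-ticket-style construction, not the paper's actual WSN or FSO procedure), a layer can keep its dense weights fixed and shared while learning per-task scores that select a binary mask:

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Toy winning-subnetwork layer: dense weights stay fixed and shared,
    and a learned score picks a binary mask over them per task.
    A generic lottery-ticket-style sketch, not WSN's exact algorithm."""

    def __init__(self, in_features, out_features, sparsity=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01,
                                   requires_grad=False)  # reused dense weights
        self.score = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.sparsity = sparsity

    def forward(self, x):
        # Keep the top-(1 - sparsity) fraction of weights by score.
        k = int(self.score.numel() * (1 - self.sparsity))
        threshold = self.score.flatten().topk(k).values.min()
        mask = (self.score >= threshold).float()
        return nn.functional.linear(x, self.weight * mask)
```

Real implementations train such masks with a straight-through estimator, since the top-k selection is non-differentiable; this sketch omits that detail.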
arXiv Detail & Related papers (2023-12-19T09:11:49Z)
- Cross-head mutual Mean-Teaching for semi-supervised medical image segmentation [6.738522094694818]
Semi-supervised medical image segmentation (SSMIS) has witnessed substantial advancements by leveraging limited labeled data and abundant unlabeled data.
Existing state-of-the-art (SOTA) methods encounter challenges in accurately predicting labels for the unlabeled data.
We propose a novel Cross-head Mutual Mean-Teaching Network (CMMT-Net) incorporating strong-weak data augmentation.
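For orientation, the mean-teacher pattern with strong-weak augmentation that this family of methods builds on can be sketched as follows; this is the generic recipe (an EMA teacher whose weak-view pseudo-labels supervise strong views), not necessarily CMMT-Net's exact heads or losses.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, decay=0.99):
    """Exponential moving average of student weights into the teacher."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1 - decay)

def consistency_loss(student, teacher, weak_batch, strong_batch):
    """Teacher pseudo-labels from weakly augmented inputs supervise the
    student on strongly augmented inputs. A generic mean-teacher sketch;
    works for segmentation logits of shape [B, C, H, W] as well."""
    with torch.no_grad():
        pseudo = teacher(weak_batch).argmax(dim=1)
    return F.cross_entropy(student(strong_batch), pseudo)
```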
arXiv Detail & Related papers (2023-10-08T09:13:04Z)
- Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss as the number of learning epochs increases.
We show that the threshold on the number of training samples increases with the network width.
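For reference, the building block these unfolded networks repeat is a gradient step followed by (soft-)thresholding; a minimal ISTA iteration for the LASSO objective looks like the sketch below, while the paper analyzes learned networks with a smoothed soft-threshold.

```python
import torch

def soft_threshold(x, theta):
    """Proximal operator of the L1 norm: sign(x) * max(|x| - theta, 0)."""
    return torch.sign(x) * torch.clamp(x.abs() - theta, min=0.0)

def ista_layer(x, y, W, step, theta):
    """One unfolded ISTA iteration for min ||y - W x||^2 + lambda ||x||_1.
    Generic form; the paper studies learned, smoothed soft-thresholding."""
    grad = W.t() @ (W @ x - y)
    return soft_threshold(x - step * grad, theta)
```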
arXiv Detail & Related papers (2023-09-12T13:03:47Z)
- On the Soft-Subnetwork for Few-shot Class Incremental Learning [67.0373924836107]
We propose a few-shot class incremental learning (FSCIL) method referred to as Soft-SubNetworks (SoftNet).
Our objective is to learn a sequence of sessions incrementally, where each session only includes a few training instances per class while preserving the knowledge of the previously learned ones.
We provide comprehensive empirical validations demonstrating that our SoftNet effectively tackles the few-shot incremental learning problem by surpassing the performance of state-of-the-art baselines over benchmark datasets.
arXiv Detail & Related papers (2022-09-15T04:54:02Z)
- Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably) [75.38159612828362]
It has been observed that the best uni-modal network outperforms the jointly trained multi-modal network.
This work provides a theoretical explanation for the emergence of such a performance gap in neural networks for the prevalent joint training framework.
arXiv Detail & Related papers (2022-03-23T06:21:53Z)
- Learning Prototype-oriented Set Representations for Meta-Learning [85.19407183975802]
Learning from set-structured data is a fundamental problem that has recently attracted increasing attention.
This paper provides a novel optimal transport based way to improve existing summary networks.
We further instantiate it to the cases of few-shot classification and implicit meta generative modeling.
arXiv Detail & Related papers (2021-10-18T09:49:05Z)
- Meta-Learning with Network Pruning [40.07436648243748]
We propose a network pruning based meta-learning approach to reduce overfitting by explicitly controlling the capacity of the network.
We have implemented our approach on top of Reptile, assembled with two network pruning routines: Dense-Sparse-Dense (DSD) and Iterative Hard Thresholding (IHT).
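Of those two routines, iterative hard thresholding is the easier to illustrate: zero all but the largest-magnitude weights, then continue training the survivors. A generic sketch of one such pruning step (not the paper's exact schedule):

```python
import torch

def hard_threshold(model, keep_ratio=0.5):
    """One IHT-style pruning step: zero all but the largest-magnitude
    weights in each parameter tensor. Generic sketch of the routine."""
    with torch.no_grad():
        for p in model.parameters():
            k = max(int(p.numel() * keep_ratio), 1)
            threshold = p.abs().flatten().topk(k).values.min()
            p.mul_((p.abs() >= threshold).float())
```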
arXiv Detail & Related papers (2020-07-07T06:13:11Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)