Towards Robust Multimodal Prompting With Missing Modalities
- URL: http://arxiv.org/abs/2312.15890v2
- Date: Wed, 27 Dec 2023 03:41:58 GMT
- Title: Towards Robust Multimodal Prompting With Missing Modalities
- Authors: Jaehyuk Jang, Yooseung Wang, Changick Kim
- Abstract summary: Multimodal prompting introduces learnable missing-aware prompts for all missing-modality cases.
It lacks robustness in scenarios where missing-modality settings differ between training and inference.
We propose a simple yet effective prompt design to address these challenges.
- Score: 22.176372579439356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, multimodal prompting, which introduces learnable missing-aware
prompts for all missing modality cases, has exhibited impressive performance.
However, it encounters two critical issues: 1) The number of prompts grows
exponentially as the number of modalities increases; and 2) It lacks robustness
in scenarios with different missing modality settings between training and
inference. In this paper, we propose a simple yet effective prompt design to
address these challenges. Instead of using missing-aware prompts, we utilize
prompts as modality-specific tokens, enabling them to capture the unique
characteristics of each modality. Furthermore, our prompt design leverages
orthogonality between prompts as a key element to learn distinct information
across different modalities and promote diversity in the learned
representations. Extensive experiments demonstrate that our prompt design
enhances both performance and robustness while reducing the number of prompts.
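As a hedged illustration of the orthogonality idea above, here is a minimal PyTorch sketch, not the authors' released code: one learnable prompt block per modality, plus a penalty that pushes the pairwise cosine similarity between modality prompts toward zero. All names, shapes, and the weighting term are assumptions.

```python
import torch
import torch.nn.functional as F

# Assumed setup: one learnable prompt block per modality (M, L, D).
num_modalities, prompt_len, dim = 2, 16, 768
prompts = torch.nn.Parameter(torch.randn(num_modalities, prompt_len, dim))

def orthogonality_loss(prompts: torch.Tensor) -> torch.Tensor:
    """Penalize overlap between per-modality prompts by driving the
    off-diagonal entries of their cosine-similarity Gram matrix to zero."""
    flat = F.normalize(prompts.reshape(prompts.size(0), -1), dim=-1)  # (M, L*D)
    gram = flat @ flat.t()                                            # cosine sims
    off_diag = gram - torch.eye(gram.size(0))                         # diag is 1
    return off_diag.pow(2).sum()

# During training, this penalty would be added to the task loss, e.g.:
# loss = task_loss + lambda_orth * orthogonality_loss(prompts)
```

Under this design, the regularizer encourages each modality's prompt to occupy a distinct direction in embedding space, which is one plausible reading of "learn distinct information across different modalities".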
Related papers
- Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition [52.522244807811894]
We propose a novel multimodal Transformer framework using prompt learning to address the issue of missing modalities.
Our method introduces three types of prompts: generative prompts, missing-signal prompts, and missing-type prompts.
Through prompt learning, we achieve a substantial reduction in the number of trainable parameters.
arXiv Detail & Related papers (2024-07-07T13:55:56Z)
- A Preliminary Empirical Study on Prompt-based Unsupervised Keyphrase Extraction [30.624421412309786]
We study the effectiveness of different prompts on the keyphrase extraction task to verify the impact of cherry-picked prompts on extraction performance.
Designing complex prompts achieves better performance than designing simple prompts on long documents.
arXiv Detail & Related papers (2024-05-26T13:37:57Z)
- Tuning Multi-mode Token-level Prompt Alignment across Modalities [48.39511580746271]
We propose a multi-mode token-level tuning framework to learn and align a set of prompt tokens across modalities.
Specifically, we rely on two essential factors: 1) multi-mode prompts discovery, which guarantees diverse semantic representations, and 2) token-level alignment, which helps explore fine-grained similarity.
Experiments on popular image recognition benchmarks show the superior generalization and few-shot abilities of our approach.
arXiv Detail & Related papers (2023-09-25T03:20:09Z)
- InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding [51.48361798508375]
We develop an information-theoretic framework that formulates soft prompt tuning as maximizing mutual information between prompts and other model parameters.
We show that InfoPrompt can significantly accelerate the convergence of the prompt tuning and outperform traditional prompt tuning methods.
arXiv Detail & Related papers (2023-06-08T04:31:48Z)
- Multi-Prompt with Depth Partitioned Cross-Modal Learning [25.239388488952375]
Partitioned Multi-modal Prompt (PMPO) is a multi-modal prompting technique that extends the soft prompt from a single learnable prompt to multiple prompts.
Our method divides the visual encoder depths and connects learnable prompts to the separated visual depths, enabling different prompts to capture contextual information at different hierarchical depths.
We evaluate the effectiveness of our approach on three challenging tasks: new class generalization, cross-dataset evaluation, and domain generalization.
arXiv Detail & Related papers (2023-05-10T14:54:29Z)
- Multimodal Prompting with Missing Modalities for Visual Recognition [40.961534960897595]
We tackle two challenges in multimodal learning for visual recognition: 1) modalities missing during training or testing in real-world situations; and 2) computational resources insufficient for finetuning heavy transformer models.
Specifically, our modality-missing-aware prompts can be plugged into multimodal transformers to handle general missing-modality cases, while requiring less than 1% of the learnable parameters of the full model.
arXiv Detail & Related papers (2023-03-06T18:54:46Z)
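The entry above and issue 1) in the main abstract both hinge on prompt counting: missing-aware designs keep one prompt per missing-modality pattern. A small counting sketch under an assumed M = 3 (modality names invented for illustration, not from either paper):

```python
from itertools import product

modalities = ["image", "text", "audio"]  # M = 3, chosen for illustration

# Missing-aware prompting: one prompt per present/absent pattern with at
# least one modality present, i.e. 2**M - 1 prompts in general.
patterns = [p for p in product([True, False], repeat=len(modalities)) if any(p)]
print(len(patterns))    # 7

# Modality-specific prompting (the surveyed paper's design) needs only M.
print(len(modalities))  # 3
```

This is the exponential-versus-linear gap that motivates replacing missing-aware prompts with modality-specific tokens.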
- Demystifying Prompts in Language Models via Perplexity Estimation [100.43627541756524]
The performance of a prompt is coupled with the extent to which the model is familiar with the language it contains.
We show that the lower the perplexity of a prompt, the better it performs the task.
arXiv Detail & Related papers (2022-12-08T02:21:47Z)
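A minimal sketch of how that perplexity heuristic could be applied, assuming a GPT-2 scorer via Hugging Face Transformers; the candidate prompt strings are invented for illustration:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def prompt_perplexity(prompt: str) -> float:
    """Perplexity of a prompt under the language model itself."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean per-token cross-entropy
    return torch.exp(loss).item()

candidates = ["Translate English to French:", "English-to-French rendering task:"]
best = min(candidates, key=prompt_perplexity)  # prefer the lowest perplexity
print(best)
```

Ranking candidates this way requires no labeled data, which is what makes the perplexity signal attractive for prompt selection.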
- MetaPrompting: Learning to Learn Better Prompts [52.914694884515534]
We propose a new soft prompting method called MetaPrompting, which adopts the well-recognized model-agnostic meta-learning algorithm.
Extensive experiments show MetaPrompting brings significant improvement on four different datasets.
arXiv Detail & Related papers (2022-09-23T09:01:05Z)
- Instance-aware Prompt Learning for Language Understanding and Generation [49.22899822734549]
We propose an instance-aware prompt learning method that learns a different prompt for each instance.
Our method achieves the state-of-the-art on the SuperGLUE few-shot learning benchmark.
arXiv Detail & Related papers (2022-01-18T17:03:25Z)
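A minimal sketch of the instance-aware idea from the entry above, assuming a small hypothetical generator network; none of the names, shapes, or architecture comes from the paper:

```python
import torch
import torch.nn as nn

dim, prompt_len = 768, 8

# Hypothetical generator: maps an instance embedding to its own soft prompt,
# instead of sharing one learned prompt across all instances.
prompt_generator = nn.Sequential(
    nn.Linear(dim, dim),
    nn.Tanh(),
    nn.Linear(dim, prompt_len * dim),
)

instance_emb = torch.randn(4, dim)           # a batch of 4 instance embeddings
prompts = prompt_generator(instance_emb)     # (4, prompt_len * dim)
prompts = prompts.view(-1, prompt_len, dim)  # one soft prompt per instance
# Each per-instance prompt would be prepended to that instance's input tokens.
```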
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.