Can Out-of-Domain data help to Learn Domain-Specific Prompts for Multimodal Misinformation Detection?
- URL: http://arxiv.org/abs/2311.16496v4
- Date: Tue, 07 Jan 2025 03:08:05 GMT
- Title: Can Out-of-Domain data help to Learn Domain-Specific Prompts for Multimodal Misinformation Detection?
- Authors: Amartya Bhattacharya, Debarshi Brahma, Suraj Nagaje Mahadev, Anmol Asati, Vikas Verma, Soma Biswas
- Abstract summary: Domain-specific Prompt tuning can exploit out-of-domain data during training to improve fake news detection across all desired domains simultaneously. Experiments on the large-scale NewsCLIPpings and VERITE benchmarks demonstrate that DPOD achieves state-of-the-art performance for this challenging task.
- Score: 14.722270908687216
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The spread of fake news using out-of-context images and captions has become widespread in this era of information overload. Since fake news can belong to different domains like politics, sports, etc., each with unique characteristics, inference on a test image-caption pair is contingent on how well the model has been trained on similar data. Since training individual models for each domain is not practical, we propose a novel framework termed DPOD (Domain-specific Prompt tuning using Out-of-domain data), which can exploit out-of-domain data during training to improve fake news detection of all desired domains simultaneously. First, to compute generalizable features, we modify the Vision-Language Model CLIP to extract features that help to align the representations of the images and corresponding captions of both the in-domain and out-of-domain data in a label-aware manner. Further, we propose a domain-specific prompt learning technique which leverages training samples of all the available domains based on the extent to which they can be useful to the desired domain. Extensive experiments on the large-scale NewsCLIPpings and VERITE benchmarks demonstrate that DPOD achieves state-of-the-art performance for this challenging task. Code: https://github.com/scviab/DPOD.
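The basic signal behind such CLIP-based out-of-context detectors is simple: embed the image and the caption into a shared space and measure their agreement. A minimal sketch of that scoring step, using toy NumPy vectors standing in for CLIP features (this is an illustrative assumption, not the DPOD implementation):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def out_of_context_score(image_emb, caption_emb):
    """Score in [0, 1]: higher means the caption agrees LESS with the
    image, i.e. a crude out-of-context signal."""
    return (1.0 - cosine_similarity(image_emb, caption_emb)) / 2.0

# Toy 512-d embeddings standing in for CLIP image/text features.
rng = np.random.default_rng(0)
img = rng.normal(size=512)
matched_cap = img + 0.1 * rng.normal(size=512)  # caption consistent with image
mismatched_cap = rng.normal(size=512)           # unrelated caption

assert out_of_context_score(img, matched_cap) < out_of_context_score(img, mismatched_cap)
```

In a real pipeline the embeddings would come from a (possibly prompt-tuned) CLIP encoder, and the threshold on the score would be learned per domain.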
Related papers
- A Unified Data Augmentation Framework for Low-Resource Multi-Domain Dialogue Generation [52.0964459842176]
Current state-of-the-art dialogue systems heavily rely on extensive training datasets.
We propose a novel data Augmentation framework for Multi-Domain Dialogue Generation, referred to as AMD^2G.
The AMD^2G framework consists of a data augmentation process and a two-stage training approach: domain-agnostic training and domain adaptation training.
arXiv Detail & Related papers (2024-06-14T09:52:27Z)
- Learning Domain-Invariant Features for Out-of-Context News Detection [19.335065976085982]
Out-of-context news is a common type of misinformation on online media platforms.
In this work, we focus on domain adaptive out-of-context news detection.
We propose ConDA-TTA which applies contrastive learning and maximum mean discrepancy (MMD) to learn domain-invariant features.
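MMD is a standard way to quantify, and then minimize, the gap between feature distributions of two domains. A self-contained NumPy sketch of the biased squared-MMD estimate with an RBF kernel (illustrative only, not the ConDA-TTA code):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """RBF (Gaussian) kernel matrix between the rows of X and Y."""
    sq_dists = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq_dists)

def mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X and Y."""
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(200, 8))       # source-domain features
tgt_near = rng.normal(0.0, 1.0, size=(200, 8))  # same distribution
tgt_far = rng.normal(2.0, 1.0, size=(200, 8))   # shifted domain

assert mmd2(src, tgt_near) < mmd2(src, tgt_far)
```

In domain-adaptation training, a term like `mmd2` is typically added to the task loss so the encoder is pushed to produce features whose distribution matches across domains.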
arXiv Detail & Related papers (2024-06-11T16:34:02Z)
- Prompt-based Visual Alignment for Zero-shot Policy Transfer [35.784936617675896]
Overfitting has become one of the main obstacles to applying reinforcement learning in real-world settings.
We propose prompt-based visual alignment (PVA) to mitigate the detrimental domain bias in the image for zero-shot policy transfer.
We verify PVA on a vision-based autonomous driving task with CARLA simulator.
arXiv Detail & Related papers (2024-06-05T13:26:30Z)
- WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization [63.98650220772378]
We present WIDIn, Wording Images for Domain-Invariant representation, to disentangle discriminative visual representation.
We first estimate the language embedding with fine-grained alignment, which can be used to adaptively identify and then remove domain-specific counterpart.
We show that WIDIn can be applied to both pretrained vision-language models like CLIP, and separately trained uni-modal models like MoCo and BERT.
arXiv Detail & Related papers (2024-05-28T17:46:27Z)
- Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapolation [19.944946262284123]
Humans can easily extrapolate novel domains, thus, an intriguing question arises: How can neural networks extrapolate like humans and achieve OOD generalization?
We introduce a novel approach to domain extrapolation that leverages reasoning ability and the extensive knowledge encapsulated within large language models (LLMs) to synthesize entirely new domains.
Our methods exhibit commendable performance in this setting, even surpassing the supervised setting by approximately 1-2% on datasets such as VLCS.
arXiv Detail & Related papers (2024-03-08T18:44:23Z)
- Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection [109.58348694132091]
Single-domain generalized object detection aims to enhance a model's generalizability to multiple unseen target domains.
This is a practical yet challenging task as it requires the model to address domain shift without incorporating target domain data into training.
We propose a novel phrase grounding-based style transfer approach for the task.
arXiv Detail & Related papers (2024-02-02T10:48:43Z)
- Robust Domain Misinformation Detection via Multi-modal Feature Alignment [49.89164555394584]
We propose a robust domain and cross-modal approach for multi-modal misinformation detection.
It reduces the domain shift by aligning the joint distribution of textual and visual modalities.
We also propose a framework that simultaneously considers application scenarios of domain generalization.
arXiv Detail & Related papers (2023-11-24T07:06:16Z)
- Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration [64.58185031596169]
Explore-Instruct is a novel approach to enhancing data coverage for domain-specific instruction tuning.
Our data-centric analysis validates the effectiveness of this proposed approach in improving domain-specific instruction coverage.
Our findings offer a promising opportunity to improve instruction coverage, especially in domain-specific contexts.
arXiv Detail & Related papers (2023-10-13T15:03:15Z)
- Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance [15.513912470752041]
The adaptation capability to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions.
Here, we investigate the problem of domain-adaptive scene text spotting, i.e., training a model on multi-domain source data.
The results clearly demonstrate the potential of intermediate representations to achieve significant performance on text spotting benchmarks across multiple domains.
arXiv Detail & Related papers (2023-10-02T06:08:01Z)
- Domain-Controlled Prompt Learning [49.45309818782329]
Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms.
We propose Domain-Controlled Prompt Learning for specific domains.
Our method achieves state-of-the-art performance in specific domain image recognition datasets.
arXiv Detail & Related papers (2023-09-30T02:59:49Z)
- Using Language to Extend to Unseen Domains [81.37175826824625]
It is expensive to collect training data for every possible domain that a vision model may encounter when deployed.
We consider how simply verbalizing the training domain as well as domains we want to extend to but do not have data for can improve robustness.
Using a multimodal model with a joint image and language embedding space, our method LADS learns a transformation of the image embeddings from the training domain to each unseen test domain.
arXiv Detail & Related papers (2022-10-18T01:14:02Z)
- Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval [55.122020263319634]
Video moment retrieval (VMR) aims to localize the target moment from an untrimmed video according to a given language query.
In this paper, we focus on a novel task: cross-domain VMR, where fully-annotated datasets are available in one domain but the domain of interest only contains unannotated datasets.
We propose a novel Multi-Modal Cross-Domain Alignment network to transfer the annotation knowledge from the source domain to the target domain.
arXiv Detail & Related papers (2022-09-23T12:58:20Z)
- Improving Fake News Detection of Influential Domain via Domain- and Instance-Level Transfer [16.886024206337257]
We propose a Domain- and Instance-level Transfer Framework for Fake News Detection (DITFEND).
DITFEND could improve the performance of specific target domains.
Online experiments show that it brings additional improvements over the base models in a real-world scenario.
arXiv Detail & Related papers (2022-09-19T10:21:13Z)
- Variational Attention: Propagating Domain-Specific Knowledge for Multi-Domain Learning in Crowd Counting [75.80116276369694]
In crowd counting, laborious labelling makes collecting a new large-scale dataset practically intractable.
We resort to multi-domain joint learning and propose a simple but effective Domain-specific Knowledge Propagating Network (DKPNet).
This is mainly achieved by the novel Variational Attention (VA) technique, which explicitly models the attention distributions for different domains.
arXiv Detail & Related papers (2021-08-18T08:06:37Z)
- AFAN: Augmented Feature Alignment Network for Cross-Domain Object Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications.
We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training.
Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
arXiv Detail & Related papers (2021-06-10T05:01:20Z)
- Inferring Latent Domains for Unsupervised Deep Domain Adaptation [54.963823285456925]
Unsupervised Domain Adaptation (UDA) refers to the problem of learning a model in a target domain where labeled data are not available.
This paper introduces a novel deep architecture which addresses the problem of UDA by automatically discovering latent domains in visual datasets.
We evaluate our approach on publicly available benchmarks, showing that it outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2021-03-25T14:33:33Z)
- Batch Normalization Embeddings for Deep Domain Generalization [50.51405390150066]
Domain generalization aims at training machine learning models to perform robustly across different and unseen domains.
We show a significant increase in classification accuracy over current state-of-the-art techniques on popular domain generalization benchmarks.
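The idea behind batch-normalization embeddings is that a domain can be summarized by the normalization statistics its data induces, so unseen domains can be located relative to the training domains. A hypothetical NumPy sketch of that summarization step (not the paper's implementation):

```python
import numpy as np

def domain_embedding(features):
    """Summarize a domain by its per-feature statistics: the mean and
    standard deviation a BatchNorm layer would accumulate over its data."""
    return np.concatenate([features.mean(axis=0), features.std(axis=0)])

def nearest_domain(sample_batch, domain_embs):
    """Map a batch from an unseen domain to the closest known domain."""
    query = domain_embedding(sample_batch)
    dists = [np.linalg.norm(query - emb) for emb in domain_embs]
    return int(np.argmin(dists))

rng = np.random.default_rng(0)
dom_a = rng.normal(0.0, 1.0, size=(500, 16))  # training domain A
dom_b = rng.normal(3.0, 0.5, size=(500, 16))  # training domain B
embs = [domain_embedding(dom_a), domain_embedding(dom_b)]

unseen = rng.normal(2.8, 0.5, size=(100, 16))  # unseen domain, closer to B
assert nearest_domain(unseen, embs) == 1
```

A generalizing model can then weight or interpolate its domain-specific components according to these distances, rather than committing to a single training domain.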
arXiv Detail & Related papers (2020-11-25T12:02:57Z)
- Domain Generalized Person Re-Identification via Cross-Domain Episodic Learning [31.17248105464821]
We present an episodic learning scheme which advances meta learning strategies to exploit the observed source-domain labeled data.
Our experiments on four benchmark datasets confirm the superiority of our method over the state-of-the-arts.
arXiv Detail & Related papers (2020-10-19T14:42:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.