Related papers: CAMO: Causality-Guided Adversarial Multimodal Domain Generalization for Crisis Classification

URL: http://arxiv.org/abs/2512.08071v1
Date: Mon, 08 Dec 2025 22:12:27 GMT
Title: CAMO: Causality-Guided Adversarial Multimodal Domain Generalization for Crisis Classification
Authors: Pingchuan Ma, Chengshuai Zhao, Bohan Jiang, Saketh Vishnubhatla, Ujun Jeong, Alimohammad Beigi, Adrienne Raglin, Huan Liu
Abstract summary: Crisis classification in social media aims to extract actionable disaster-related information from posts. Existing approaches primarily leverage deep learning to fuse textual and visual cues for crisis classification. We introduce a causality-guided multimodal domain generalization framework that combines adversarial disentanglement with unified representation learning.
Score: 16.165585394745786
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Crisis classification in social media aims to extract actionable disaster-related information from multimodal posts, which is a crucial task for enhancing situational awareness and facilitating timely emergency responses. However, the wide variation in crisis types makes achieving generalizable performance across unseen disasters a persistent challenge. Existing approaches primarily leverage deep learning to fuse textual and visual cues for crisis classification, achieving numerically plausible results under in-domain settings. However, they exhibit poor generalization across unseen crisis types because they 1. do not disentangle spurious and causal features, resulting in performance degradation under domain shift, and 2. fail to align heterogeneous modality representations within a shared space, which hinders the direct adaptation of established single-modality domain generalization (DG) techniques to the multimodal setting. To address these issues, we introduce a causality-guided multimodal domain generalization (MMDG) framework that combines adversarial disentanglement with unified representation learning for crisis classification. The adversarial objective encourages the model to disentangle and focus on domain-invariant causal features, leading to more generalizable classifications grounded in stable causal mechanisms. The unified representation aligns features from different modalities within a shared latent space, enabling single-modality DG strategies to be seamlessly extended to multimodal learning. Experiments on different datasets demonstrate that our approach achieves the best performance in unseen disaster scenarios.

Related papers

Position: General Alignment Has Hit a Ceiling; Edge Alignment Must Be Taken Seriously [51.03213216886717]
We take the position that the dominant paradigm of General Alignment reaches a structural ceiling in settings with conflicting values. We introduce Edge Alignment as a distinct approach in which systems preserve multidimensional value structure.
arXiv Detail & Related papers (2026-02-23T16:51:43Z)
Open-Vocabulary Domain Generalization in Urban-Scene Segmentation [83.15573353963235]
Domain Generalization in Semantic Segmentation (DG-SS) aims to enable segmentation models to perform robustly in unseen environments. Recent progress in Vision-Language Models (VLMs) has advanced Open-Vocabulary Semantic Segmentation (OV-SS) by enabling models to recognize a broader range of concepts. Yet, these models remain sensitive to domain shifts and struggle to maintain robustness when deployed in unseen environments. We propose S2-Corr, a state-space-driven text-image correlation refinement mechanism that produces more consistent text-image correlations under distribution changes.
arXiv Detail & Related papers (2026-02-21T14:32:27Z)
Learning Representation and Synergy Invariances: A Provable Framework for Generalized Multimodal Face Anti-Spoofing [85.00865662325954]
Multimodal Face Anti-Spoofing (FAS) methods, which integrate multiple visual modalities, often suffer even more severe performance degradation when deployed in unseen domains. This is mainly due to two overlooked risks that affect cross-domain multimodal generalization. We propose a provable framework, namely Multimodal Representation and Synergy Invariance Learning (RiSe).
arXiv Detail & Related papers (2025-11-18T05:37:06Z)
Bridging Domain Generalization to Multimodal Domain Generalization via Unified Representations [43.07575348801021]
Domain Generalization (DG) aims to enhance model robustness in unseen or distributionally shifted target domains through training exclusively on source domains. A key challenge in Multi-modal Domain Generalization (MMDG) has emerged: enabling models trained on multi-modal sources to generalize to unseen target distributions within the same modality set. We propose a novel approach that leverages Unified Representations to map different paired modalities together.
arXiv Detail & Related papers (2025-07-04T05:17:32Z)
Offline Multi-agent Reinforcement Learning via Score Decomposition [51.23590397383217]
Offline cooperative multi-agent reinforcement learning (MARL) faces unique challenges due to distributional shifts. This work is the first to explicitly address the distributional gap between offline and online MARL.
arXiv Detail & Related papers (2025-05-09T11:42:31Z)
Causal Inference via Style Bias Deconfounding for Domain Generalization [28.866189619091227]
We introduce Style Deconfounding Causal Learning, a novel causal inference-based framework designed to explicitly address style as a confounding factor. Our approach begins by constructing a structural causal model (SCM) tailored to the domain generalization problem and applies a backdoor adjustment strategy to account for style influence. Building on this foundation, we design a style-guided expert module (SGEM) to adaptively cluster style distributions during training, capturing the global confounding style. A back-door causal learning module (BDCL) performs causal interventions during feature extraction, ensuring fair integration of global confounding styles into sample predictions, effectively reducing style bias.
arXiv Detail & Related papers (2025-03-21T04:52:31Z)
DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing [58.62312400472865]
Multi-modal face anti-spoofing (FAS) has emerged as a prominent research focus. We propose an alignment module between modalities based on mutual information. We employ a dual alignment optimization method that aligns both sub-domain hyperplanes and modality angle margins.
arXiv Detail & Related papers (2025-03-01T10:12:00Z)
Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing [26.901402236963374]
Face Anti-Spoofing (FAS) is crucial for securing face recognition systems against presentation attacks. Many multi-modal FAS approaches have emerged, but they face challenges in generalizing to unseen attacks and deployment conditions.
arXiv Detail & Related papers (2024-02-29T16:06:36Z)
Rethinking Domain Generalization: Discriminability and Generalizability [31.967801550742312]
Domain generalization (DG) endeavors to develop robust models that possess strong generalizability while preserving excellent discriminability. We present a novel framework, Discriminative Microscopic Distribution Alignment (DMDA). DMDA incorporates two core components: Selective Channel Pruning (SCP) and Micro-level Distribution Alignment (MDA).
arXiv Detail & Related papers (2023-09-28T14:45:54Z)
Randomized Adversarial Style Perturbations for Domain Generalization [49.888364462991234]
We propose a novel domain generalization technique, referred to as Randomized Adversarial Style Perturbation (RASP). The proposed algorithm perturbs the style of a feature in an adversarial direction towards a randomly selected class, and makes the model learn against being misled by the unexpected styles observed in unseen target domains. We evaluate the proposed algorithm via extensive experiments on various benchmarks and show that our approach improves domain generalization performance, especially in large-scale benchmarks.
arXiv Detail & Related papers (2023-04-04T17:07:06Z)
Global-Local Regularization Via Distributional Robustness [26.983769514262736]
Deep neural networks are often vulnerable to adversarial examples and distribution shifts. Recent approaches leverage distributional robustness optimization (DRO) to find the most challenging distribution. We propose a novel regularization technique in the vein of the Wasserstein-based DRO framework.
arXiv Detail & Related papers (2022-03-01T15:36:12Z)
Learning Domain Invariant Representations for Generalizable Person Re-Identification [71.35292121563491]
Generalizable person Re-Identification (ReID) has recently attracted growing attention in the computer vision community. We introduce causality into person ReID and propose a novel generalizable framework, named Domain Invariant Representations for generalizable person Re-Identification (DIR-ReID).
arXiv Detail & Related papers (2021-03-29T18:59:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.