Unveiling Modality Bias: Automated Sample-Specific Analysis for Multimodal Misinformation Benchmarks
- URL: http://arxiv.org/abs/2511.05883v1
- Date: Sat, 08 Nov 2025 06:48:19 GMT
- Title: Unveiling Modality Bias: Automated Sample-Specific Analysis for Multimodal Misinformation Benchmarks
- Authors: Hehai Lin, Hui Liu, Shilei Cao, Jing Li, Haoliang Li, Wenya Wang
- Abstract summary: We investigate the design for automated recognition of modality bias at the sample level. To verify the effectiveness, we conduct a human evaluation on two popular benchmarks.
- Score: 46.7779085335442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Numerous multimodal misinformation benchmarks exhibit bias toward specific modalities, allowing detectors to make predictions based solely on one modality. While previous research has quantified bias at the dataset level or manually identified spurious correlations between modalities and labels, these approaches lack meaningful insights at the sample level and struggle to scale to the vast amount of online information. In this paper, we investigate the design for automated recognition of modality bias at the sample level. Specifically, we propose three bias quantification methods based on theories/views of different levels of granularity: 1) a coarse-grained evaluation of modality benefit; 2) a medium-grained quantification of information flow; and 3) a fine-grained causality analysis. To verify the effectiveness, we conduct a human evaluation on two popular benchmarks. Experimental results reveal three interesting findings that provide potential direction toward future research: 1) Ensembling multiple views is crucial for reliable automated analysis; 2) Automated analysis is prone to detector-induced fluctuations; and 3) Different views produce a higher agreement on modality-balanced samples but diverge on biased ones.
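The abstract's first and third findings (ensembling views, and measuring how much views agree) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's method: the label names (`text-biased`, `image-biased`, `balanced`), the majority-vote rule, the `undecided` fallback for ties, and the helper names `ensemble_vote` and `agreement`.

```python
from collections import Counter

def ensemble_vote(view_labels):
    """Majority vote over per-view labels (hypothetical rule).

    Returns 'undecided' when the top two labels are tied.
    """
    counts = Counter(view_labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "undecided"
    return counts[0][0]

def agreement(view_labels):
    """Fraction of views that share the most common label."""
    top_count = Counter(view_labels).most_common(1)[0][1]
    return top_count / len(view_labels)

# Toy per-sample labels from three views (coarse, medium, fine).
# These values are made up for illustration only.
samples = {
    "s1": ["balanced", "balanced", "balanced"],
    "s2": ["text-biased", "image-biased", "text-biased"],
    "s3": ["text-biased", "image-biased", "balanced"],
}

results = {sid: (ensemble_vote(v), agreement(v)) for sid, v in samples.items()}
for sid, (label, agree) in results.items():
    print(sid, label, round(agree, 2))
```

In this toy setup, a low agreement score flags samples where the views diverge, which is where a single-view analysis would be least trustworthy.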
Related papers
- Investigating the Impact of Hard Samples on Accuracy Reveals In-class Data Imbalance [4.291589126905706]
In the AutoML domain, test accuracy is heralded as the quintessential metric for evaluating model efficacy.
However, the reliability of test accuracy as the primary performance metric has been called into question.
The distribution of hard samples between training and test sets affects the difficulty levels of those sets.
We propose a benchmarking procedure for comparing hard sample identification methods.
arXiv Detail & Related papers (2024-09-22T11:38:14Z)
- Towards Multimodal Sentiment Analysis Debiasing via Bias Purification [21.170000473208372]
Multimodal Sentiment Analysis (MSA) aims to understand human intentions by integrating emotion-related clues from diverse modalities.
The MSA task invariably suffers from unplanned dataset biases, particularly multimodal utterance-level label bias and word-level context bias.
We present a Multimodal Counterfactual Inference Sentiment analysis framework based on causality rather than conventional likelihood.
arXiv Detail & Related papers (2024-03-08T03:55:27Z)
- General Debiasing for Multimodal Sentiment Analysis [47.05329012210878]
We propose a general debiasing MSA task, which aims to enhance the Out-Of-Distribution (OOD) generalization ability of MSA models.
We employ IPW to reduce the effects of large-biased samples, facilitating robust feature learning for sentiment prediction.
The empirical results demonstrate the superior generalization ability of our proposed framework.
arXiv Detail & Related papers (2023-07-20T00:36:41Z)
- Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
- Mitigating Dataset Bias by Using Per-sample Gradient [9.290757451344673]
We propose PGD (Per-sample Gradient-based Debiasing), which comprises three steps: training a model on uniform batch sampling, setting the importance of each sample in proportion to the norm of the sample gradient, and training the model using importance-batch sampling.
Compared with existing baselines for various synthetic and real-world datasets, the proposed method showed state-of-the-art accuracy for the classification task.
arXiv Detail & Related papers (2022-05-31T11:41:02Z)
- Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- Abnormal-aware Multi-person Evaluation System with Improved Fuzzy Weighting [0.0]
We choose the two-stage screening method, which consists of rough screening and score-weighted Kendall-$\tau$ distance.
We use the Fuzzy Synthetic Evaluation (FSE) method to determine the significance of scores given by reviewers as well as their reliability.
arXiv Detail & Related papers (2022-05-01T03:42:43Z)
- On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z)
- How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models [95.8037674226622]
We introduce a 3-dimensional evaluation metric that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion.
Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity.
arXiv Detail & Related papers (2021-02-17T18:25:30Z)
- On conditional versus marginal bias in multi-armed bandits [105.07190334523304]
The bias of the sample means of the arms in multi-armed bandits is an important issue in adaptive data analysis.
We characterize the sign of the conditional bias of monotone functions of the rewards, including the sample mean.
Our results hold for arbitrary conditioning events and leverage natural monotonicity properties of the data collection policy.
arXiv Detail & Related papers (2020-02-19T20:16:10Z)
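The PGD entry in the list above describes setting each sample's importance in proportion to the norm of its gradient and then sampling batches by importance. A minimal sketch of that middle step follows, under stated assumptions: the model is a plain logistic regression (chosen here only because its per-sample gradient norm has a closed form), the weight vector `w` stands in for the model from the first training step, and the helper names `grad_norm`, `importance_weights`, and `sample_batch` are hypothetical rather than taken from the paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_norm(w, x, y):
    """L2 norm of the per-sample logistic-loss gradient w.r.t. w.

    For logistic regression the gradient is (p - y) * x,
    so its norm is |p - y| * ||x||.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return abs(p - y) * math.sqrt(sum(xi * xi for xi in x))

def importance_weights(w, data):
    """Normalize per-sample gradient norms into sampling probabilities."""
    norms = [grad_norm(w, x, y) for x, y in data]
    total = sum(norms)
    return [n / total for n in norms]

def sample_batch(data, weights, batch_size, rng):
    """Draw a batch with replacement, proportionally to the weights."""
    return rng.choices(data, weights=weights, k=batch_size)

# Assumed stand-in for the model trained with uniform batches (step 1).
w = [1.0, 0.0]
# Toy (features, label) pairs; the second sample is badly misclassified.
data = [([2.0, 0.0], 1), ([2.0, 0.0], 0), ([0.0, 1.0], 1)]

weights = importance_weights(w, data)
batch = sample_batch(data, weights, 4, random.Random(0))
print([round(p, 3) for p in weights])
```

In this toy example the misclassified sample receives the largest weight, so it is drawn most often in the importance-sampled batches used for the final training step.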
This list is automatically generated from the titles and abstracts of the papers in this site.