Related papers: Target-oriented Multimodal Sentiment Classification with Counterfactual-enhanced Debiasing

URL: http://arxiv.org/abs/2509.09160v1
Date: Thu, 11 Sep 2025 05:40:53 GMT
Title: Target-oriented Multimodal Sentiment Classification with Counterfactual-enhanced Debiasing
Authors: Zhiyue Liu, Fanrong Ma, Xin Ling,
Abstract summary: Multimodal sentiment classification seeks to predict sentiment polarity for specific targets from image-text pairs. Existing works often over-rely on textual content and fail to consider dataset biases. We introduce a novel counterfactual-enhanced debiasing framework to reduce such spurious correlations.
Score: 5.0175188046562385
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Target-oriented multimodal sentiment classification seeks to predict sentiment polarity for specific targets from image-text pairs. While existing works achieve competitive performance, they often over-rely on textual content and fail to consider dataset biases, in particular word-level contextual biases. This leads to spurious correlations between text features and output labels, impairing classification accuracy. In this paper, we introduce a novel counterfactual-enhanced debiasing framework to reduce such spurious correlations. Our framework incorporates a counterfactual data augmentation strategy that minimally alters sentiment-related causal features, generating detail-matched image-text samples to guide the model's attention toward content tied to sentiment. Furthermore, for learning robust features from counterfactual data and prompting model decisions, we introduce an adaptive debiasing contrastive learning mechanism, which effectively mitigates the influence of biased words. Experimental results on several benchmark datasets show that our proposed method outperforms state-of-the-art baselines.

Related papers

Estimating the Influence of Sequentially Correlated Literary Properties in Textual Classification: A Data-Centric Hypothesis-Testing Approach [4.161155428666988]
We introduce a data-centric hypothesis-testing framework to quantify the influence of sequentially correlated literary properties. We compare traditional (word n-grams and character k-mers) and neural (contrastively trained) embeddings in both supervised and unsupervised classification settings. Our results demonstrate that controlling for sequential correlation is essential for reducing false positives.
arXiv Detail & Related papers (2024-11-07T18:28:40Z)
Common-Sense Bias Modeling for Classification Tasks [15.683471433842492]
We propose a novel framework to extract comprehensive biases in image datasets based on textual descriptions. Our method uncovers novel model biases in multiple image benchmark datasets. The discovered bias can be mitigated by simple data re-weighting to de-correlate the features.
arXiv Detail & Related papers (2024-01-24T03:56:07Z)
Debiasing Stance Detection Models with Counterfactual Reasoning and Adversarial Bias Learning [15.68462203989933]
Stance detection models tend to rely on dataset bias in the text part as a shortcut. We propose an adversarial bias learning module to model the bias more accurately.
arXiv Detail & Related papers (2022-12-20T16:20:56Z)
Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation [44.319739489968164]
Deep neural networks often take dataset biases as a shortcut to make decisions rather than understand tasks. In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution. We propose a training strategy Less-Learn-Shortcut (LLS): our strategy quantifies the biased degree of the biased examples and down-weights them accordingly.
arXiv Detail & Related papers (2022-05-25T09:08:35Z)
A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video. Recent studies have found that current benchmark datasets may have obvious moment annotation biases. We introduce a new evaluation metric "dR@n,IoU@m" that discounts the basic recall scores to alleviate the inflating evaluation caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race. Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables. This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective. We propose to augment the input sentences in the training data with their corresponding predicate-argument structures. We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
arXiv Detail & Related papers (2020-10-23T16:22:05Z)
Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis. We learn <sentiment, aspect> joint topic embeddings in the word embedding space. We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances. Semantic components are distilled from utterances via multi-head self-attention. Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
Identifying Spurious Correlations for Robust Text Classification [9.457737910527829]
We propose a method to distinguish spurious and genuine correlations in text classification. We use features derived from treatment effect estimators to distinguish spurious correlations from "genuine" ones. Experiments on four datasets suggest that using this approach to inform feature selection also leads to more robust classification.
arXiv Detail & Related papers (2020-10-06T03:49:22Z)
Addressing Class Imbalance in Scene Graph Parsing by Learning to Contrast and Score [65.18522219013786]
Scene graph parsing aims to detect objects in an image scene and recognize their relations. Recent approaches have achieved high average scores on some popular benchmarks, but fail in detecting rare relations. This paper introduces a novel integrated framework of classification and ranking to resolve the class imbalance problem.
arXiv Detail & Related papers (2020-09-28T13:57:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.