Multimodal Sentiment Analysis Based on Causal Reasoning
- URL: http://arxiv.org/abs/2412.07292v1
- Date: Tue, 10 Dec 2024 08:21:19 GMT
- Title: Multimodal Sentiment Analysis Based on Causal Reasoning
- Authors: Fuhai Chen, Pengpeng Huang, Xuri Ge, Jie Huang, Zishuo Bao
- Abstract summary: We propose a novel CounterFactual Multimodal Sentiment Analysis framework (CF-MSA) that uses causal counterfactual inference to model multimodal sentiment.
Experimental results on two public datasets, MVSA-Single and MVSA-Multiple, demonstrate that the proposed CF-MSA has superior debiasing capability and achieves new state-of-the-art performance.
- Score: 6.610016449061257
- Abstract: With the rapid development of multimedia, the shift from unimodal textual sentiment analysis to multimodal image-text sentiment analysis has attracted academic and industrial attention in recent years. However, multimodal sentiment analysis is affected by unimodal data bias; e.g., text sentiment can be misleading due to explicit sentiment semantics, leading to low accuracy in the final sentiment classification. In this paper, we propose a novel CounterFactual Multimodal Sentiment Analysis framework (CF-MSA) that uses causal counterfactual inference to model multimodal sentiment. CF-MSA mitigates the direct effect of unimodal bias and ensures heterogeneity across modalities by differentiating the treatment variables between modalities. In addition, considering the information complementarity and bias differences between modalities, we propose a new optimisation objective to effectively integrate the modalities and reduce the inherent bias of each one. Experimental results on two public datasets, MVSA-Single and MVSA-Multiple, demonstrate that the proposed CF-MSA has superior debiasing capability and achieves new state-of-the-art performance. We will release the code and datasets to facilitate future research.
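To make the debiasing mechanism concrete, below is a minimal PyTorch sketch of the general counterfactual recipe the abstract describes: train unimodal branches alongside a fused multimodal branch, then subtract the unimodal direct effect at inference. The branch layout, the additive fusion rule, and all names here are illustrative assumptions, not the authors' released CF-MSA implementation (which additionally differentiates treatment variables per modality and uses a dedicated optimisation objective).

```python
import torch
import torch.nn as nn

class CounterfactualDebiasedMSA(nn.Module):
    """Hypothetical sketch of counterfactual debiasing for image-text
    sentiment analysis; branch layout and names are assumptions."""

    def __init__(self, text_dim: int, image_dim: int, num_classes: int = 3):
        super().__init__()
        self.text_head = nn.Linear(text_dim, num_classes)    # captures unimodal text bias
        self.image_head = nn.Linear(image_dim, num_classes)  # captures unimodal image bias
        self.fusion_head = nn.Linear(text_dim + image_dim, num_classes)  # multimodal branch

    def forward(self, text_feat: torch.Tensor, image_feat: torch.Tensor):
        z_text = self.text_head(text_feat)
        z_image = self.image_head(image_feat)
        z_fused = self.fusion_head(torch.cat([text_feat, image_feat], dim=-1))
        # Factual (total-effect) logits: fused branch plus unimodal branches.
        total_effect = z_fused + z_text + z_image
        # Counterfactual logits: what the unimodal shortcuts alone predict,
        # i.e. the natural direct effect of each modality's bias.
        direct_effect = z_text + z_image
        return total_effect, direct_effect

# Usage: train with a classification loss on total_effect (per-branch losses
# are common too); at inference, debias via the total indirect effect.
model = CounterfactualDebiasedMSA(text_dim=768, image_dim=512)
te, nde = model(torch.randn(4, 768), torch.randn(4, 512))
debiased_logits = te - nde  # remove the part explained by unimodal bias alone
```

Subtracting the direct effect at inference is the standard counterfactual-debiasing step: the unimodal branches absorb the shortcut signal during training, so removing their contribution leaves a prediction that relies on genuine cross-modal evidence.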
Related papers
- Asymmetric Reinforcing against Multi-modal Representation Bias [59.685072206359855]
We propose an Asymmetric Reinforcing method against Multimodal representation bias (ARM).
Our ARM dynamically reinforces the weak modalities while maintaining the ability to represent dominant modalities through conditional mutual information.
We have significantly improved the performance of multimodal learning, making notable progress in mitigating imbalanced multimodal learning.
arXiv Detail & Related papers (2025-01-02T13:00:06Z) - Toward Robust Incomplete Multimodal Sentiment Analysis via Hierarchical Representation Learning [21.127950337002776]
Multimodal Sentiment Analysis (MSA) is an important research area that aims to understand and recognize human sentiment through multiple modalities.
We propose a Hierarchical Representation Learning Framework (HRLF) for the task under uncertain missing modalities.
We show that HRLF significantly improves MSA performance under uncertain modality missing cases.
arXiv Detail & Related papers (2024-11-05T04:04:41Z) - Evaluation of data inconsistency for multi-modal sentiment analysis [20.332527596452625]
Emotion semantic inconsistency is a ubiquitous challenge in multi-modal sentiment analysis.
Our research presents a new challenge and offers valuable insights for the future development of sentiment analysis systems.
arXiv Detail & Related papers (2024-06-05T07:11:56Z) - Trustworthy Multimodal Fusion for Sentiment Analysis in Ordinal Sentiment Space [4.58550667556734]
Multimodal video sentiment analysis aims to integrate multiple modal information to analyze the opinions and attitudes of speakers.
Previous approaches treat different modalities equally, largely ignoring their different contributions.
We propose a trustworthy multimodal sentiment ordinal network (TMSON) to improve performance in sentiment analysis.
arXiv Detail & Related papers (2024-04-13T08:15:57Z) - Towards Multimodal Sentiment Analysis Debiasing via Bias Purification [21.170000473208372]
Multimodal Sentiment Analysis (MSA) aims to understand human intentions by integrating emotion-related clues from diverse modalities.
The MSA task invariably suffers from unplanned dataset biases, particularly multimodal utterance-level label bias and word-level context bias.
We present a Multimodal Counterfactual Inference Sentiment analysis framework based on causality rather than conventional likelihood.
arXiv Detail & Related papers (2024-03-08T03:55:27Z) - WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge [73.76722241704488]
We propose a plug-in framework named WisdoM to leverage the contextual world knowledge induced from the large vision-language models (LVLMs) for enhanced multimodal sentiment analysis.
We show that our approach has substantial improvements over several state-of-the-art methods.
arXiv Detail & Related papers (2024-01-12T16:08:07Z) - Counterfactual Reasoning for Out-of-distribution Multimodal Sentiment Analysis [56.84237932819403]
This paper aims to estimate and mitigate the adverse effect of the textual modality for strong OOD generalization.
Inspired by this, we devise a model-agnostic counterfactual framework for multimodal sentiment analysis.
arXiv Detail & Related papers (2022-07-24T03:57:40Z) - Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z) - Transformer-based Multi-Aspect Modeling for Multi-Aspect Multi-Sentiment Analysis [56.893393134328996]
We propose a novel Transformer-based Multi-aspect Modeling scheme (TMM), which can capture potential relations between multiple aspects and simultaneously detect the sentiment of all aspects in a sentence.
Our method achieves noticeable improvements compared with strong baselines such as BERT and RoBERTa.
arXiv Detail & Related papers (2020-11-01T11:06:31Z) - MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis [48.776247141839875]
We propose a novel framework, MISA, which projects each modality to two distinct subspaces.
The first subspace is modality-invariant, where the representations across modalities learn their commonalities and reduce the modality gap.
Our experiments on popular sentiment analysis benchmarks, MOSI and MOSEI, demonstrate significant gains over state-of-the-art models.
arXiv Detail & Related papers (2020-05-07T15:13:23Z)
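To illustrate the MISA entry above, here is a minimal sketch of its invariant/specific projection idea. The layer shapes, the soft orthogonality penalty, and all names are assumptions for illustration, not the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InvariantSpecificProjector(nn.Module):
    """Hypothetical sketch: project one modality's features into a shared
    (modality-invariant) and a private (modality-specific) subspace."""

    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Tanh())
        self.private = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Tanh())

    def forward(self, x: torch.Tensor):
        return self.shared(x), self.private(x)

def similarity_loss(shared_a: torch.Tensor, shared_b: torch.Tensor) -> torch.Tensor:
    # Pull the invariant representations of two modalities together,
    # shrinking the modality gap.
    return F.mse_loss(shared_a, shared_b)

def difference_loss(shared: torch.Tensor, private: torch.Tensor) -> torch.Tensor:
    # Push one modality's invariant and specific representations apart
    # (a soft orthogonality penalty).
    return (shared * private).sum(dim=-1).pow(2).mean()
```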