Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey
- URL: http://arxiv.org/abs/2406.08068v2
- Date: Fri, 16 Aug 2024 10:50:45 GMT
- Title: Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey
- Authors: Hao Yang, Yanyan Zhao, Yang Wu, Shilong Wang, Tian Zheng, Hongbo Zhang, Zongyang Ma, Wanxiang Che, Bing Qin
- Abstract summary: ChatGPT has opened up immense potential for applying large language models (LLMs) to text-centric multimodal tasks.
It remains unclear how existing LLMs can be better adapted to text-centric multimodal sentiment analysis tasks.
- Score: 66.166184609616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compared to traditional sentiment analysis, which considers only text, multimodal sentiment analysis must weigh emotional signals from multiple modalities simultaneously and is therefore more consistent with the way humans process sentiment in real-world scenarios. It involves processing emotional information from sources such as natural language, images, videos, audio, and physiological signals. Although other modalities also carry diverse emotional cues, natural language usually contains richer contextual information and therefore occupies a central position in multimodal sentiment analysis. The emergence of ChatGPT has opened up immense potential for applying large language models (LLMs) to text-centric multimodal tasks. However, it remains unclear how existing LLMs can be better adapted to text-centric multimodal sentiment analysis tasks. This survey aims to (1) present a comprehensive review of recent research on text-centric multimodal sentiment analysis tasks, (2) examine the potential of LLMs for text-centric multimodal sentiment analysis, outlining their approaches, advantages, and limitations, (3) summarize the application scenarios of LLM-based multimodal sentiment analysis technology, and (4) explore the challenges and potential research directions for multimodal sentiment analysis in the future.
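To make the text-centric setting concrete, the sketch below shows the common pattern such approaches follow: non-text modalities are first converted into text (captions, transcripts) and then folded into a single LLM prompt. The helpers `caption_image`, `transcribe_audio`, and `llm` are hypothetical placeholders standing in for any captioner, speech recognizer, and chat-style LLM endpoint, not a specific method from the survey.

```python
from typing import Optional

# A minimal sketch of text-centric multimodal sentiment analysis with an LLM.
# All three helpers below are hypothetical placeholders.

def caption_image(image_path: str) -> str:
    raise NotImplementedError("plug in an image captioning model here")

def transcribe_audio(audio_path: str) -> str:
    raise NotImplementedError("plug in a speech recognition model here")

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in any chat-style LLM call here")

def text_centric_sentiment(text: str,
                           image_path: Optional[str] = None,
                           audio_path: Optional[str] = None) -> str:
    """Fold every modality into text, then ask the LLM for a sentiment label."""
    parts = [f"Post text: {text}"]
    if image_path:
        parts.append(f"Image description: {caption_image(image_path)}")
    if audio_path:
        parts.append(f"Speech transcript: {transcribe_audio(audio_path)}")
    prompt = ("Classify the overall sentiment of the following multimodal post "
              "as positive, negative, or neutral.\n" + "\n".join(parts))
    return llm(prompt).strip().lower()
```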
Related papers
- PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis [74.41260927676747]
This paper bridges the gaps by introducing a multimodal conversational Aspect-Based Sentiment Analysis (ABSA) task.
To benchmark the task, we construct PanoSent, a dataset annotated both manually and automatically, featuring high quality, large scale, multimodality, multilingualism, multiple scenarios, and coverage of both implicit and explicit sentiment elements.
To effectively address the tasks, we devise a novel Chain-of-Sentiment reasoning framework, together with a novel multimodal large language model (namely Sentica) and a paraphrase-based verification mechanism.
arXiv Detail & Related papers (2024-08-18T13:51:01Z)
- A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks [74.52259252807191]
Multimodal Large Language Models (MLLMs) address the complexities of real-world applications far beyond the capabilities of single-modality systems.
This paper systematically surveys the applications of MLLMs in multimodal tasks spanning natural language, vision, and audio.
arXiv Detail & Related papers (2024-08-02T15:14:53Z)
- Evaluation of data inconsistency for multi-modal sentiment analysis [20.332527596452625]
Emotion semantic inconsistency is a ubiquitous challenge in multi-modal sentiment analysis.
Our research presents a new challenge and offers valuable insights for the future development of sentiment analysis systems.
arXiv Detail & Related papers (2024-06-05T07:11:56Z)
- M2SA: Multimodal and Multilingual Model for Sentiment Analysis of Tweets [4.478789600295492]
This paper transforms an existing textual Twitter sentiment dataset into a multimodal format through a straightforward curation process.
Our work opens up new avenues for sentiment-related research.
arXiv Detail & Related papers (2024-04-02T09:11:58Z)
- WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge [73.76722241704488]
We propose a plug-in framework named WisdoM that leverages contextual world knowledge induced from large vision-language models (LVLMs) to enhance multimodal sentiment analysis.
We show that our approach achieves substantial improvements over several state-of-the-art methods.
arXiv Detail & Related papers (2024-01-12T16:08:07Z)
- Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)
- DiaASQ: A Benchmark of Conversational Aspect-based Sentiment Quadruple Analysis [84.80347062834517]
We introduce DiaASQ, which aims to detect target-aspect-opinion-sentiment quadruples in a dialogue (sketched after this entry).
We manually construct a large-scale, high-quality DiaASQ dataset in both Chinese and English.
We develop a neural model to benchmark the task, which effectively performs end-to-end quadruple prediction.
arXiv Detail & Related papers (2022-11-10T17:18:20Z)
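For readers new to the quadruple formulation, here is a minimal sketch of the target-aspect-opinion-sentiment structure that DiaASQ extracts; the field names and example values are illustrative, not the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class SentimentQuadruple:
    """One target-aspect-opinion-sentiment quadruple from a dialogue."""
    target: str     # entity under discussion, e.g. "the new phone"
    aspect: str     # attribute of the target, e.g. "battery life"
    opinion: str    # opinion expression from the dialogue, e.g. "drains too fast"
    sentiment: str  # polarity label: "positive" | "negative" | "neutral"

# A dialogue-level prediction is then a set of such quadruples (illustrative):
example = [
    SentimentQuadruple("the new phone", "battery life", "drains too fast", "negative"),
    SentimentQuadruple("the new phone", "camera", "really sharp", "positive"),
]
```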
- Multilingual Multimodality: A Taxonomical Survey of Datasets, Techniques, Challenges and Opportunities [10.721189858694396]
We study the unification of multilingual and multimodal (MultiX) streams.
We review the languages studied and the gold- or silver-standard data with parallel annotations, and examine how these modalities and languages interact in modeling.
We present an account of the modeling approaches along with their strengths and weaknesses to better understand in which scenarios they can be used reliably.
arXiv Detail & Related papers (2022-10-30T21:46:01Z)
- A Novel Context-Aware Multimodal Framework for Persian Sentiment Analysis [19.783517380422854]
We present a first-of-its-kind Persian multimodal dataset comprising more than 800 utterances.
We present a novel context-aware multimodal sentiment analysis framework.
We employ both decision-level (late) and feature-level (early) fusion methods to integrate affective cross-modal information (both strategies are sketched below).
arXiv Detail & Related papers (2021-03-03T19:09:01Z)
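As a rough illustration of the two fusion strategies mentioned in the entry above (and not the paper's actual architecture): feature-level fusion concatenates per-modality features before a single classifier, while decision-level fusion averages per-modality predictions.

```python
import numpy as np

# Illustrative sketch only: the classifiers are stand-ins for any trained models,
# and the feature vectors are assumed to be precomputed per modality.

def early_fusion(text_feat, audio_feat, video_feat, joint_classifier):
    """Feature-level (early) fusion: concatenate features, classify once."""
    joint = np.concatenate([text_feat, audio_feat, video_feat])
    return joint_classifier(joint)  # class probabilities

def late_fusion(text_feat, audio_feat, video_feat,
                text_clf, audio_clf, video_clf):
    """Decision-level (late) fusion: classify each modality, average the decisions."""
    probs = [text_clf(text_feat), audio_clf(audio_feat), video_clf(video_feat)]
    return np.mean(probs, axis=0)
```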