Sentiment Analysis in the Era of Large Language Models: A Reality Check
- URL: http://arxiv.org/abs/2305.15005v1
- Date: Wed, 24 May 2023 10:45:25 GMT
- Title: Sentiment Analysis in the Era of Large Language Models: A Reality Check
- Authors: Wenxuan Zhang, Yue Deng, Bing Liu, Sinno Jialin Pan, Lidong Bing
- Abstract summary: This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
- Score: 69.97942065617664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentiment analysis (SA) has been a long-standing research area in natural
language processing. It can offer rich insights into human sentiments and
opinions and has thus seen considerable interest from both academia and
industry. With the advent of large language models (LLMs) such as ChatGPT,
there is a great potential for their employment on SA problems. However, the
extent to which existing LLMs can be leveraged for different sentiment analysis
tasks remains unclear. This paper aims to provide a comprehensive investigation
into the capabilities of LLMs in performing various sentiment analysis tasks,
from conventional sentiment classification to aspect-based sentiment analysis
and multifaceted analysis of subjective texts. We evaluate performance across
13 tasks on 26 datasets and compare the results against small language models
(SLMs) trained on domain-specific datasets. Our study reveals that while LLMs
demonstrate satisfactory performance in simpler tasks, they lag behind in more
complex tasks requiring deeper understanding or structured sentiment
information. However, LLMs significantly outperform SLMs in few-shot learning
settings, suggesting their potential when annotation resources are limited. We
also highlight the limitations of current evaluation practices in assessing
LLMs' SA abilities and propose a novel benchmark, \textsc{SentiEval}, for a
more comprehensive and realistic evaluation. Data and code during our
investigations are available at
\url{https://github.com/DAMO-NLP-SG/LLM-Sentiment}.
Related papers
- The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead? [60.01746782465275]
Large Language Models (LLMs) have shown capabilities close to human performance in various analytical tasks.
This paper investigates the efficiency and accuracy of LLMs in specialized tasks through a structured user study focusing on Human-LLM partnership.
arXiv Detail & Related papers (2024-10-07T02:30:18Z) - Do Large Language Models Possess Sensitive to Sentiment? [18.88126980975737]
Large Language Models (LLMs) have recently displayed their extraordinary capabilities in language understanding.
This paper investigates the ability of LLMs to detect and react to sentiment in text modal.
arXiv Detail & Related papers (2024-09-04T01:40:20Z) - Decompose and Aggregate: A Step-by-Step Interpretable Evaluation Framework [75.81096662788254]
Large Language Models (LLMs) are scalable and economical evaluators.
The question of how reliable these evaluators are has emerged as a crucial research question.
We propose Decompose and Aggregate, which breaks down the evaluation process into different stages based on pedagogical practices.
arXiv Detail & Related papers (2024-05-24T08:12:30Z) - Benchmarking LLMs on the Semantic Overlap Summarization Task [9.656095701778975]
This paper comprehensively evaluates Large Language Models (LLMs) on the Semantic Overlap Summarization (SOS) task.
We report well-established metrics like ROUGE, BERTscore, and SEM-F1$ on two different datasets of alternative narratives.
arXiv Detail & Related papers (2024-02-26T20:33:50Z) - Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z) - Exploring the Potential of Large Language Models in Computational Argumentation [54.85665903448207]
Large language models (LLMs) have demonstrated impressive capabilities in understanding context and generating natural language.
This work aims to embark on an assessment of LLMs, such as ChatGPT, Flan models, and LLaMA2 models, in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-11-15T15:12:15Z) - LLM-in-the-loop: Leveraging Large Language Model for Thematic Analysis [18.775126929754833]
Thematic analysis (TA) has been widely used for analyzing qualitative data in many disciplines and fields.
Human coders develop and deepen their data interpretation and coding over multiple iterations, making TA labor-intensive and time-consuming.
We propose a human-LLM collaboration framework (i.e., LLM-in-the-loop) to conduct TA with in-context learning (ICL)
arXiv Detail & Related papers (2023-10-23T17:05:59Z) - TRACE: A Comprehensive Benchmark for Continual Learning in Large
Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety.
Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs.
We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.