Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study
- URL: http://arxiv.org/abs/2304.04339v2
- Date: Sat, 17 Feb 2024 10:08:06 GMT
- Title: Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study
- Authors: Zengzhi Wang, Qiming Xie, Yi Feng, Zixiang Ding, Zinong Yang, Rui Xia
- Abstract summary: ChatGPT has drawn great attention from both the research community and the public.
We provide a preliminary evaluation of ChatGPT on the understanding of \emph{opinions}, \emph{sentiments}, and \emph{emotions} contained in the text.
- Score: 31.719155787410685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, ChatGPT has drawn great attention from both the research community
and the public. We are particularly interested in whether it can serve as a
universal sentiment analyzer. To this end, in this work, we provide a
preliminary evaluation of ChatGPT on the understanding of \emph{opinions},
\emph{sentiments}, and \emph{emotions} contained in the text. Specifically, we
evaluate it in three settings, including \emph{standard} evaluation,
\emph{polarity shift} evaluation and \emph{open-domain} evaluation. We conduct
an evaluation on 7 representative sentiment analysis tasks covering 17
benchmark datasets and compare ChatGPT with fine-tuned BERT and corresponding
state-of-the-art (SOTA) models on them. We also attempt several popular
prompting techniques to elicit the ability further. Moreover, we conduct human
evaluation and present some qualitative case studies to gain a deep
comprehension of its sentiment analysis capabilities.
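The "standard" zero-shot evaluation setting described in the abstract can be sketched as a prompt template plus an accuracy loop. The prompt wording and the toy model below are illustrative assumptions, not the paper's actual prompts or API calls:

```python
# Minimal sketch of zero-shot sentiment classification evaluation.
# The prompt template and the stub model are hypothetical placeholders
# standing in for an LLM call.

def build_prompt(text: str) -> str:
    """Wrap a piece of text in a zero-shot sentiment-classification instruction."""
    return (
        "Classify the sentiment of the following text as positive or negative.\n"
        f"Text: {text}\nSentiment:"
    )

def evaluate(model, dataset):
    """Accuracy of `model` (a prompt -> label callable) on (text, gold) pairs."""
    correct = sum(model(build_prompt(text)) == gold for text, gold in dataset)
    return correct / len(dataset)

# Stub standing in for a real model, keyed on a toy cue word.
def toy_model(prompt: str) -> str:
    return "positive" if "great" in prompt else "negative"

dataset = [("The movie was great", "positive"), ("Terrible plot", "negative")]
print(evaluate(toy_model, dataset))  # 1.0
```

In practice `toy_model` would be replaced by a call to the model under evaluation, and the same loop would run over each of the benchmark datasets.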
Related papers
- OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization [52.720711541731205]
We present OpinSummEval, a dataset comprising human judgments and outputs from 14 opinion summarization models.
Our findings indicate that metrics based on neural networks generally outperform non-neural ones.
arXiv Detail & Related papers (2023-10-27T13:09:54Z)
- Leveraging ChatGPT As Text Annotation Tool For Sentiment Analysis [6.596002578395151]
ChatGPT, a new product of OpenAI, has emerged as the most popular AI application.
This study explores the use of ChatGPT as a tool for data labeling for different sentiment analysis tasks.
arXiv Detail & Related papers (2023-06-18T12:20:42Z)
- Large Language Models are not Fair Evaluators [60.27164804083752]
We find that the quality ranking of candidate responses can be easily hacked by altering their order of appearance in the context.
This manipulation allows us to skew the evaluation result, making one model appear considerably superior to the other.
We propose a framework with three simple yet effective strategies to mitigate this issue.
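One common way to counter the position bias described above is to query the judge with both orderings of the two candidates and only accept a verdict that is stable under the swap. This is a hedged sketch of that general idea, not necessarily one of the paper's three strategies; the judge here is a stub:

```python
# Order-swapped judging: a verdict that flips when the candidates swap
# slots is treated as positional bias and discarded as a tie.

def debiased_compare(judge, answer_a: str, answer_b: str):
    """Return 'A', 'B', or 'tie' using order-swapped judging.

    `judge(x, y)` returns 'first' or 'second', indicating which of its
    two arguments it prefers.
    """
    verdict_ab = judge(answer_a, answer_b)
    verdict_ba = judge(answer_b, answer_a)
    if verdict_ab == "first" and verdict_ba == "second":
        return "A"
    if verdict_ab == "second" and verdict_ba == "first":
        return "B"
    return "tie"  # verdict changed with ordering -> positional bias

# An order-invariant stub judge that prefers the longer answer.
fair_judge = lambda x, y: "first" if len(x) > len(y) else "second"
print(debiased_compare(fair_judge, "a longer, detailed answer", "short"))  # A
```

A judge that always prefers whichever answer appears first would return "tie" for every pair under this scheme, exposing the bias rather than propagating it.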
arXiv Detail & Related papers (2023-05-29T07:41:03Z)
- Multidimensional Evaluation for Text Style Transfer Using ChatGPT [14.799109368073548]
We investigate the potential of ChatGPT as a multidimensional evaluator for the task of \emph{Text Style Transfer}.
We test its performance on three commonly-used dimensions of text style transfer evaluation: style strength, content preservation, and fluency.
These preliminary results are expected to provide a first glimpse into the role of large language models in the multidimensional evaluation of stylized text generation.
arXiv Detail & Related papers (2023-04-26T11:33:35Z)
- Evaluating ChatGPT's Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness [18.945934162722466]
We focus on assessing the overall ability of ChatGPT using 7 fine-grained information extraction (IE) tasks.
ChatGPT's performance in Standard-IE setting is poor, but it surprisingly exhibits excellent performance in the OpenIE setting.
ChatGPT provides high-quality and trustworthy explanations for its decisions.
arXiv Detail & Related papers (2023-04-23T12:33:18Z)
- Human-like Summarization Evaluation with ChatGPT [38.39767193442397]
ChatGPT was able to complete annotations relatively smoothly using Likert scale scoring, pairwise comparison, Pyramid, and binary factuality evaluation.
It outperformed commonly used automatic evaluation metrics on some datasets.
arXiv Detail & Related papers (2023-04-05T16:17:32Z)
- Is ChatGPT a Good NLG Evaluator? A Preliminary Study [121.77986688862302]
We provide a preliminary meta-evaluation on ChatGPT to show its reliability as an NLG metric.
Experimental results show that compared with previous automatic metrics, ChatGPT achieves state-of-the-art or competitive correlation with human judgments.
We hope our preliminary study could prompt the emergence of a general-purpose reliable NLG metric.
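The "correlation with human judgments" in such a meta-evaluation is typically a rank correlation such as Spearman's rho between metric scores and human ratings. A small self-contained sketch, with purely illustrative scores:

```python
# Spearman's rho: Pearson correlation of the rank-transformed scores.
# Assumes no tied scores, which keeps the rank transform simple.

def ranks(xs):
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = float(rank)
    return r

def spearman(xs, ys):
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

human  = [1.0, 2.0, 3.0, 4.0]   # hypothetical human quality ratings
metric = [0.2, 0.5, 0.4, 0.9]   # hypothetical metric scores
print(round(spearman(human, metric), 2))  # 0.8
```

A metric "achieving state-of-the-art correlation" means its rho against human ratings is at least as high as that of the previous automatic metrics on the same outputs.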
arXiv Detail & Related papers (2023-03-07T16:57:20Z)
- On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective [67.98821225810204]
We evaluate the robustness of ChatGPT from the adversarial and out-of-distribution perspective.
Results show consistent advantages on most adversarial and OOD classification and translation tasks.
ChatGPT shows astounding performance in understanding dialogue-related texts.
arXiv Detail & Related papers (2023-02-22T11:01:20Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
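The joint topic-embedding idea above can be sketched loosely as: each <sentiment, aspect> pair is a vector in the word-embedding space, and a word is labeled by its nearest topic under cosine similarity. The 2-D vectors below are toy values, not learned embeddings:

```python
import math

# Toy <sentiment, aspect> topic vectors living in the same space as
# (hypothetical) word embeddings.
topics = {
    ("positive", "food"):    (1.0, 0.1),
    ("negative", "service"): (0.1, 1.0),
}

def cosine(u, v):
    """Cosine similarity of two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def nearest_topic(word_vec):
    """Assign a word embedding to its most similar <sentiment, aspect> topic."""
    return max(topics, key=lambda t: cosine(word_vec, topics[t]))

print(nearest_topic((0.9, 0.2)))  # ('positive', 'food')
```

In the weakly-supervised setting, such word-level topic assignments provide the discriminative signal that the neural models then generalize from.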
This list is automatically generated from the titles and abstracts of the papers in this site.