An Empirical Study of Benchmarking Chinese Aspect Sentiment Quad
Prediction
- URL: http://arxiv.org/abs/2311.01713v1
- Date: Fri, 3 Nov 2023 05:00:44 GMT
- Title: An Empirical Study of Benchmarking Chinese Aspect Sentiment Quad
Prediction
- Authors: Junxian Zhou, Haiqin Yang, Ye Junpeng, Yuxuan He and Hao Mou
- Abstract summary: We construct two large Chinese ASQP datasets crawled from multiple online platforms.
The datasets hold several significant characteristics: larger size (each with 10,000+ samples), rich aspect categories, more words per sentence, and higher density than existing ASQP datasets.
We are the first to evaluate the performance of Generative Pre-trained Transformer (GPT) series models on ASQP and exhibit potential issues.
- Score: 6.189770781546809
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Aspect sentiment quad prediction (ASQP) is a critical subtask of aspect-level
sentiment analysis. Current ASQP datasets are characterized by their small size
and low quadruple density, which hinders technical development. To expand
capacity, we construct two large Chinese ASQP datasets crawled from multiple
online platforms. The datasets hold several significant characteristics: larger
size (each with 10,000+ samples) and rich aspect categories, more words per
sentence, and higher density than existing ASQP datasets. Moreover, we are the
first to evaluate the performance of Generative Pre-trained Transformer (GPT)
series models on ASQP and exhibit potential issues. The experiments with
state-of-the-art ASQP baselines underscore the need to explore additional
techniques to address ASQP, as well as the importance of further investigation
into methods to improve the performance of GPTs.
Related papers
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
arXiv Detail & Related papers (2024-06-26T05:30:21Z) - Single and Multi-Hop Question-Answering Datasets for Reticular Chemistry with GPT-4-Turbo [0.5110571587151475]
'RetChemQA' is a benchmark dataset designed to evaluate the capabilities of machine learning models in the domain of reticular chemistry.
This dataset includes both single-hop and multi-hop question-answer pairs, encompassing approximately 45,000 Q&As for each type.
The questions have been extracted from an extensive corpus of literature containing about 2,530 research papers from publishers including NAS, ACS, RSC, Elsevier, and Nature Publishing Group.
arXiv Detail & Related papers (2024-05-03T14:29:54Z) - TextSquare: Scaling up Text-Centric Visual Instruction Tuning [64.55339431760727]
We introduce a new approach for creating a massive, high-quality instruction-tuning dataset, Square-10M.
Our model, TextSquare, considerably surpasses open-source previous state-of-the-art Text-centric MLLMs.
It even outperforms top-tier models like GPT4V and Gemini in 6 of 10 text-centric benchmarks.
arXiv Detail & Related papers (2024-04-19T11:38:08Z) - Adaptive Data Augmentation for Aspect Sentiment Quad Prediction [21.038795249448675]
Aspect sentiment quad prediction (ASQP) aims to predict the quad sentiment elements for a given sentence.
Data imbalance issue has not received sufficient attention in ASQP task.
We propose an Adaptive Data Augmentation (ADA) framework to tackle the imbalance issue.
arXiv Detail & Related papers (2024-01-12T06:20:56Z) - Gemini vs GPT-4V: A Preliminary Comparison and Combination of
Vision-Language Models Through Qualitative Cases [98.35348038111508]
This paper presents an in-depth comparative study of two pioneering models: Google's Gemini and OpenAI's GPT-4V(ision)
The core of our analysis delves into the distinct visual comprehension abilities of each model.
Our findings illuminate the unique strengths and niches of both models.
arXiv Detail & Related papers (2023-12-22T18:59:58Z) - Accelerated materials language processing enabled by GPT [5.518792725397679]
We develop generative transformer (GPT)-enabled pipelines for materials language processing.
First, we develop a GPT-enabled document classification method for screening relevant documents.
Secondly, for NER task, we design an entity-centric prompts, and learning few-shot of them improved the performance.
Finally, we develop an GPT-enabled extractive QA model, which provides improved performance and shows the possibility of automatically correcting annotations.
arXiv Detail & Related papers (2023-08-18T07:31:13Z) - A Unified One-Step Solution for Aspect Sentiment Quad Prediction [3.428123050377681]
Aspect sentiment quad prediction (ASQP) is a challenging yet significant subtask in aspect-based sentiment analysis.
We release two new datasets for ASQP, which contain the following characteristics: larger size, more words per sample, and higher density.
We propose a unified one-step solution for ASQP, namely One-ASQP, to detect the aspect categories and to identify the aspect-opinion-sentiment triplets simultaneously.
arXiv Detail & Related papers (2023-06-07T05:00:01Z) - GPT-4 Technical Report [116.90398195245983]
GPT-4 is a large-scale, multimodal model which can accept image and text inputs and produce text outputs.
It exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers.
arXiv Detail & Related papers (2023-03-15T17:15:04Z) - QAFactEval: Improved QA-Based Factual Consistency Evaluation for
Summarization [116.56171113972944]
We show that carefully choosing the components of a QA-based metric is critical to performance.
Our solution improves upon the best-performing entailment-based metric and achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-12-16T00:38:35Z) - Probabilistic Graph Attention Network with Conditional Kernels for
Pixel-Wise Prediction [158.88345945211185]
We present a novel approach that advances the state of the art on pixel-level prediction in a fundamental aspect, i.e. structured multi-scale features learning and fusion.
We propose a probabilistic graph attention network structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs) model for learning and fusing multi-scale representations in a principled manner.
arXiv Detail & Related papers (2021-01-08T04:14:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.