Adaptive Data Augmentation for Aspect Sentiment Quad Prediction
- URL: http://arxiv.org/abs/2401.06394v1
- Date: Fri, 12 Jan 2024 06:20:56 GMT
- Title: Adaptive Data Augmentation for Aspect Sentiment Quad Prediction
- Authors: Wenyuan Zhang, Xinghua Zhang, Shiyao Cui, Kun Huang, Xuebin Wang, and Tingwen Liu
- Abstract summary: Aspect sentiment quad prediction (ASQP) aims to predict the quad sentiment elements for a given sentence.
The data imbalance issue has not received sufficient attention in the ASQP task.
We propose an Adaptive Data Augmentation (ADA) framework to tackle the imbalance issue.
- Score: 21.038795249448675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Aspect sentiment quad prediction (ASQP) aims to predict the quad sentiment
elements for a given sentence, which is a critical task in the field of
aspect-based sentiment analysis. However, the data imbalance issue has not
received sufficient attention in the ASQP task. In this paper, we divide the issue
into two parts, quad-pattern imbalance and aspect-category imbalance, and
propose an Adaptive Data Augmentation (ADA) framework to tackle the imbalance
issue. Specifically, a data augmentation process with a condition function
adaptively enhances the tail quad patterns and aspect categories, alleviating
the data imbalance in ASQP. Following previous studies, we further explore
the generative framework for extracting complete quads by introducing the
category prior knowledge and syntax-guided decoding target. Experimental
results demonstrate that data augmentation for imbalance in the ASQP task can
improve the performance, and the proposed ADA method is superior to naive data
oversampling.
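The abstract compares ADA against naive data oversampling. As a rough illustration of that baseline only (not of ADA itself, whose condition function is not specified here), the following sketch duplicates training examples until every aspect category reaches a minimum quad count. The data layout, the function names `category_counts` and `naive_oversample`, and the `min_count` threshold are all assumptions made for this example.

```python
import random
from collections import Counter

# Assumed layout: each example is (sentence, quads), and each quad is
# (aspect_term, aspect_category, opinion_term, sentiment_polarity).

def category_counts(dataset):
    """Count how many quads fall into each aspect category."""
    counts = Counter()
    for _, quads in dataset:
        for _, category, _, _ in quads:
            counts[category] += 1
    return counts

def naive_oversample(dataset, min_count, seed=0):
    """Duplicate examples until every category has at least min_count quads.

    This is plain oversampling: examples are copied unchanged, so no new
    sentences or quads are created.
    """
    rng = random.Random(seed)
    counts = category_counts(dataset)
    augmented = list(dataset)
    for category in sorted(counts):
        # Examples that contain at least one quad of this category.
        pool = [ex for ex in dataset if any(q[1] == category for q in ex[1])]
        while counts[category] < min_count:
            example = rng.choice(pool)
            augmented.append(example)
            # A copied example raises the count of every category it contains.
            for quad in example[1]:
                counts[quad[1]] += 1
    return augmented
```

Because the copies are verbatim, this baseline rebalances category frequencies without adding any new linguistic variety, which is one plausible reason an adaptive augmentation method could outperform it.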
Related papers
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
arXiv Detail & Related papers (2024-06-26T05:30:21Z)
- CoFInAl: Enhancing Action Quality Assessment with Coarse-to-Fine Instruction Alignment [38.12600984070689]
Action Quality Assessment (AQA) is pivotal for quantifying actions across domains like sports and medical care.
Existing methods often rely on pre-trained backbones from large-scale action recognition datasets to boost performance on smaller AQA datasets.
We propose Coarse-to-Fine Instruction Alignment (CoFInAl) to align AQA with broader pre-trained tasks by reformulating it as a coarse-to-fine classification task.
arXiv Detail & Related papers (2024-04-22T09:03:21Z)
- Self-Consistent Reasoning-based Aspect-Sentiment Quad Prediction with Extract-Then-Assign Strategy [17.477542644785483]
We propose Self-Consistent Reasoning-based Aspect-sentiment quadruple Prediction (SCRAP).
SCRAP optimizes its model to generate reasonings and the corresponding sentiment quadruplets in sequence.
In the end, SCRAP significantly improves the model's ability to handle complex reasoning tasks and correctly predict quadruplets through consistency voting.
arXiv Detail & Related papers (2024-03-01T08:34:02Z)
- Few-shot learning for COVID-19 Chest X-Ray Classification with Imbalanced Data: An Inter vs. Intra Domain Study [49.5374512525016]
Medical image datasets are essential for training models used in computer-aided diagnosis, treatment planning, and medical research.
Some challenges are associated with these datasets, including variability in data distribution, data scarcity, and transfer learning issues when using models pre-trained from generic images.
We propose a methodology based on Siamese neural networks in which a series of techniques are integrated to mitigate the effects of data scarcity and distribution imbalance.
arXiv Detail & Related papers (2024-01-18T16:59:27Z)
- An Empirical Study of Benchmarking Chinese Aspect Sentiment Quad Prediction [6.189770781546809]
We construct two large Chinese ASQP datasets crawled from multiple online platforms.
The datasets hold several significant characteristics: larger size (each with 10,000+ samples), rich aspect categories, more words per sentence, and higher density than existing ASQP datasets.
We are the first to evaluate the performance of Generative Pre-trained Transformer (GPT) series models on ASQP and exhibit potential issues.
arXiv Detail & Related papers (2023-11-03T05:00:44Z)
- Balanced Classification: A Unified Framework for Long-Tailed Object Detection [74.94216414011326]
Conventional detectors suffer from performance degradation when dealing with long-tailed data due to a classification bias towards the majority head categories.
We introduce a unified framework called BAlanced CLassification (BACL), which enables adaptive rectification of inequalities caused by disparities in category distribution.
BACL consistently achieves performance improvements across various datasets with different backbones and architectures.
arXiv Detail & Related papers (2023-08-04T09:11:07Z)
- Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks [54.306234256074255]
We identify the issue of tokenization inconsistency that is commonly neglected in training generative models.
This issue damages the extractive nature of these tasks after the input and output are tokenized inconsistently.
We show that, with consistent tokenization, the model performs better in both in-domain and out-of-domain datasets.
arXiv Detail & Related papers (2022-12-19T23:33:21Z)
- Adaptive Ranking-based Sample Selection for Weakly Supervised Class-imbalanced Text Classification [4.151073288078749]
We propose Adaptive Ranking-based Sample Selection (ARS2) to alleviate the data imbalance issue in the weak supervision (WS) paradigm.
ARS2 calculates a probabilistic margin score based on the output of the current model to measure and rank the cleanliness of each data point.
Experiments show that ARS2 outperformed the state-of-the-art imbalanced learning and WS methods, leading to a 2%-57.8% improvement on their F1-score.
arXiv Detail & Related papers (2022-10-06T17:49:22Z)
- Data Augmentation Imbalance For Imbalanced Attribute Classification [60.71438625139922]
We propose a new re-sampling algorithm called data augmentation imbalance (DAI) to explicitly enhance the ability to discriminate the rarer attributes.
Our DAI algorithm achieves state-of-the-art results on pedestrian attribute datasets.
arXiv Detail & Related papers (2020-04-19T20:43:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.