Heuristic-enhanced Candidates Selection strategy for GPTs tackle Few-Shot Aspect-Based Sentiment Analysis
- URL: http://arxiv.org/abs/2404.06063v2
- Date: Mon, 19 Aug 2024 07:50:54 GMT
- Title: Heuristic-enhanced Candidates Selection strategy for GPTs tackle Few-Shot Aspect-Based Sentiment Analysis
- Authors: Baoxing Jiang, Yujie Wan, Shenggen Ju
- Abstract summary: The paper designs a Heuristic-enhanced Candidates Selection strategy and further proposes the All in One (AiO) model based on it.
The model works in two stages, simultaneously accommodating the accuracy of PLMs and the generalization capability of GPTs.
The experimental results demonstrate that the proposed model can better adapt to multiple sub-tasks, and also outperforms the methods that directly utilize GPTs.
- Score: 1.5020330976600738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-Shot Aspect-Based Sentiment Analysis (FSABSA) is an indispensable and highly challenging task in natural language processing. However, methods based on Pre-trained Language Models (PLMs) struggle to accommodate multiple sub-tasks, and methods based on Generative Pre-trained Transformers (GPTs) perform poorly. To address the above issues, the paper designs a Heuristic-enhanced Candidates Selection (HCS) strategy and further proposes the All in One (AiO) model based on it. The model works in two stages, simultaneously accommodating the accuracy of PLMs and the generalization capability of GPTs. Specifically, in the first stage, a backbone model based on PLMs generates rough heuristic candidates for the input sentence. In the second stage, AiO leverages LLMs' in-context learning capabilities to generate precise predictions. The study conducted comprehensive comparative and ablation experiments on five benchmark datasets. The experimental results demonstrate that the proposed model can better adapt to multiple sub-tasks, and also outperforms the methods that directly utilize GPTs.
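As a rough illustration of the two-stage design described in the abstract, the sketch below wires a candidate-generating PLM backbone into an LLM prompt built from few-shot demonstrations. It is a minimal sketch, not the authors' released code: the `plm_candidates` and `llm_complete` callables, the prompt template, and the candidate format are all hypothetical stand-ins.

```python
from typing import Callable, List, Tuple

# Hypothetical candidate format: (aspect term, sentiment polarity).
Candidate = Tuple[str, str]

def two_stage_absa(
    sentence: str,
    plm_candidates: Callable[[str, int], List[Candidate]],  # stage 1: PLM backbone
    llm_complete: Callable[[str], str],                      # stage 2: GPT-style model
    demonstrations: List[Tuple[str, Candidate]],             # in-context examples
    k: int = 5,
) -> str:
    # Stage 1: the PLM backbone proposes rough heuristic candidates for the sentence.
    candidates = plm_candidates(sentence, k)

    # Stage 2: pack demonstrations and candidates into a prompt so the LLM can
    # select or refine a precise prediction via in-context learning.
    demo_block = "\n".join(
        f"Sentence: {s}\nAnswer: aspect={a}, polarity={p}"
        for s, (a, p) in demonstrations
    )
    candidate_block = "; ".join(f"({a}, {p})" for a, p in candidates)
    prompt = (
        f"{demo_block}\n\n"
        f"Sentence: {sentence}\n"
        f"Candidate (aspect, polarity) pairs: {candidate_block}\n"
        f"Answer:"
    )
    return llm_complete(prompt)
```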
Related papers
- Effective Demonstration Annotation for In-Context Learning via Language Model-Based Determinantal Point Process [45.632012199451275]
In-context learning (ICL) is a few-shot learning paradigm that involves learning mappings through input-output pairs.
Existing works are highly dependent on large-scale labeled support sets, which are not always feasible in practical scenarios.
We introduce the Language Model-based Determinantal Point Process (LM-DPP), which simultaneously considers the uncertainty and diversity of unlabeled instances for optimal selection.
arXiv Detail & Related papers (2024-08-04T18:08:15Z)
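To make the uncertainty-plus-diversity selection idea concrete, here is a generic greedy DPP-style subset selection over a quality-weighted similarity kernel. It is a sketch of the general technique, not LM-DPP's exact formulation; the `quality` scores (e.g., derived from model uncertainty) and the sentence `embeddings` are assumed to be precomputed.

```python
import numpy as np

def greedy_dpp_select(embeddings: np.ndarray, quality: np.ndarray, k: int) -> list:
    """Greedily pick k items that maximize det(L_S) for L = diag(q) @ S @ diag(q)."""
    # Cosine similarity between L2-normalized embeddings.
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    S = X @ X.T
    L = quality[:, None] * S * quality[None, :]

    selected, remaining = [], set(range(len(quality)))
    for _ in range(min(k, len(quality))):
        best, best_logdet = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best, best_logdet = i, logdet
        if best is None:
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```

High-quality but mutually similar instances are penalized through the determinant, which is the uncertainty/diversity trade-off the summary above refers to.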
- Enhancing Travel Choice Modeling with Large Language Models: A Prompt-Learning Approach [6.913791588789051]
We introduce a novel prompt-learning-based Large Language Model (LLM) framework that significantly improves prediction accuracy and provides explicit explanations for individual predictions.
We tested the framework's efficacy using two widely used choice datasets: London Passenger Mode Choice (LPMC) and Optima-Mode collected in Switzerland.
The results indicate that the LLM significantly outperforms state-of-the-art deep learning methods and discrete choice models in predicting people's choices.
arXiv Detail & Related papers (2024-06-19T13:46:08Z)
- A Comparative Analysis of Pretrained Language Models for Text-to-Speech [13.962029761484022]
State-of-the-art text-to-speech (TTS) systems have utilized pretrained language models (PLMs) to enhance prosody and create more natural-sounding speech.
While PLMs have been extensively researched for natural language understanding (NLU), their impact on TTS has been overlooked.
This is the first study of its kind to investigate the impact of different PLMs on TTS.
arXiv Detail & Related papers (2023-09-04T13:02:27Z)
- Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study [76.52997424694767]
We present an in-depth empirical study of keyphrase extraction and keyphrase generation using pre-trained language models.
We show that PLMs have competitive high-resource performance and state-of-the-art low-resource performance.
Further results show that in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models.
arXiv Detail & Related papers (2022-12-20T13:20:21Z)
- Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning [97.28779163988833]
Multiple pre-training objectives compensate for the limited understanding capability of single-objective language modeling.
We propose MOMETAS, a novel adaptive sampler based on meta-learning, which learns the latent sampling pattern on arbitrary pre-training objectives.
arXiv Detail & Related papers (2022-10-19T04:38:26Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
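The noise-stability idea can be sketched as an auxiliary loss that compares hidden states computed with and without Gaussian noise injected into the input embeddings. This is a simplified illustration rather than the paper's exact layerwise formulation; it assumes a Hugging Face-style encoder that accepts `inputs_embeds` and returns `hidden_states`, and in practice the term would be added to the task loss with a tunable weight.

```python
import torch

def noise_stability_loss(model, input_ids, attention_mask, sigma: float = 0.01) -> torch.Tensor:
    """Penalize how much hidden states shift when Gaussian noise perturbs the embeddings."""
    embeds = model.get_input_embeddings()(input_ids)

    clean = model(inputs_embeds=embeds, attention_mask=attention_mask,
                  output_hidden_states=True).hidden_states
    noisy_embeds = embeds + sigma * torch.randn_like(embeds)
    noisy = model(inputs_embeds=noisy_embeds, attention_mask=attention_mask,
                  output_hidden_states=True).hidden_states

    # Mean squared distance between clean and noisy hidden states, averaged over layers.
    return torch.stack([(c - n).pow(2).mean() for c, n in zip(clean, noisy)]).mean()
```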
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks into a sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) on average performance by a large margin in both few-shot and full-shot settings.
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
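The reformulation above (aspect extraction and polarity prediction as text generation) can be illustrated with a unidirectional model prompted to emit (aspect, polarity) pairs as a flat string. The linearized output format and prompt below are assumptions, and GPT-2 is only a stand-in: an off-the-shelf checkpoint would need fine-tuning on such linearized examples before the generated pairs become useful.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def generate_absa(sentence: str, max_new_tokens: int = 32) -> str:
    # Assumed linearization: "Review: ...\nAspect sentiment pairs: aspect | polarity; ..."
    prompt = f"Review: {sentence}\nAspect sentiment pairs:"
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens,
                            pad_token_id=tokenizer.eos_token_id)
    # Return only the newly generated continuation.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

print(generate_absa("The battery life is great but the screen is dim."))
```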
- SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities [76.97949110580703]
We introduce SUPERB-SG, a new benchmark to evaluate pre-trained models across various speech tasks.
We use a lightweight methodology to test the robustness of representations learned by pre-trained models under shifts in data domain.
We also show that the task diversity of SUPERB-SG coupled with limited task supervision is an effective recipe for evaluating the generalizability of model representation.
arXiv Detail & Related papers (2022-03-14T04:26:40Z)
- Generating Training Data with Language Models: Towards Zero-Shot Language Understanding [35.92571138322246]
Pretrained language models (PLMs) have demonstrated remarkable performance in various natural language processing tasks.
We present a simple approach that uses both unidirectional and bidirectional PLMs for fully zero-shot learning of NLU tasks.
Our approach demonstrates strong performance across seven classification tasks of the GLUE benchmark.
arXiv Detail & Related papers (2022-02-09T16:02:18Z)
- Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification [59.698811329287174]
We leverage GPT-2 for generating artificial training instances in order to improve classification performance.
Our results show that fine-tuning GPT-2 on a handful of labeled instances leads to consistent classification improvements.
arXiv Detail & Related papers (2021-11-17T12:10:03Z)
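The augmentation recipe above (fine-tune GPT-2 on a few labeled examples, then sample synthetic instances per label) might look roughly like the sketch below. The label-prefixed conditioning format and the `seed_examples` structure are assumptions, not the paper's exact protocol, and the fine-tuning step itself is elided.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Hypothetical label-prefixed format used both for fine-tuning and for conditioning.
seed_examples = [("positive", "Great phone, the camera is superb."),
                 ("negative", "Battery died after two days.")]
train_texts = [f"<label={y}> {x}{tokenizer.eos_token}" for y, x in seed_examples]
# ... fine-tune `model` on `train_texts` with the standard causal-LM objective ...

def synthesize(label: str, n: int = 5, max_new_tokens: int = 40) -> list:
    """Sample n synthetic training instances conditioned on a class label."""
    inputs = tokenizer(f"<label={label}>", return_tensors="pt")
    outputs = model.generate(**inputs, do_sample=True, top_p=0.95,
                             num_return_sequences=n, max_new_tokens=max_new_tokens,
                             pad_token_id=tokenizer.eos_token_id)
    return [tokenizer.decode(o[inputs["input_ids"].shape[1]:], skip_special_tokens=True)
            for o in outputs]
```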