Text Classification via Large Language Models
- URL: http://arxiv.org/abs/2305.08377v3
- Date: Mon, 9 Oct 2023 15:52:30 GMT
- Title: Text Classification via Large Language Models
- Authors: Xiaofei Sun, Xiaoya Li, Jiwei Li, Fei Wu, Shangwei Guo, Tianwei Zhang, and Guoyin Wang
- Abstract summary: We introduce Clue And Reasoning Prompting (CARP) to address complex linguistic phenomena involved in text classification.
Remarkably, CARP yields new SOTA performances on 4 out of 5 widely-used text-classification benchmarks.
More importantly, we find that CARP performs impressively in low-resource and domain-adaptation setups.
- Score: 63.1874290788797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the remarkable success of large-scale Language Models (LLMs) such as
GPT-3, they still significantly underperform fine-tuned models on the task of
text classification. This is due to (1) the lack of reasoning ability needed to
address complex linguistic phenomena (e.g., intensification, contrast, irony,
etc.); and (2) the limited number of tokens allowed in in-context learning.
In this paper, we introduce Clue And Reasoning Prompting (CARP). CARP adopts
a progressive reasoning strategy tailored to the complex linguistic phenomena
involved in text classification: CARP first prompts the LLM to find superficial
clues (e.g., keywords, tones, semantic relations, references, etc.), based on
which a diagnostic reasoning process is induced to reach the final decision.
To further address the limited-token issue, CARP uses a model fine-tuned on the
supervised dataset for $k$NN demonstration search in in-context learning,
allowing the method to take advantage of both the LLM's generalization ability
and the task-specific evidence provided by the full labeled dataset.
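To make the two-step prompting concrete, here is a minimal sketch of how a clue-and-reasoning prompt could be assembled for sentiment classification. The instruction wording, the label set, and the build_carp_prompt helper are illustrative assumptions, not the paper's exact templates.

```python
# Minimal sketch of a clue-and-reasoning (CARP-style) prompt builder.
# The template text and label set are assumptions for illustration only.

LABELS = ["positive", "negative"]

def build_carp_prompt(text: str, demonstrations: list[dict]) -> str:
    """Assemble a prompt that asks the LLM to (1) list superficial clues,
    (2) reason diagnostically over them, and (3) output a final label."""
    lines = [
        "Classify the sentiment of the input sentence as positive or negative.",
        "First list CLUES (keywords, tone, semantic relations, references),",
        "then give a short diagnostic REASONING based on the clues,",
        "and finally output the LABEL.",
        "",
    ]
    # Few-shot demonstrations (retrieved via kNN search in the full method).
    for demo in demonstrations:
        lines += [
            f"Input: {demo['text']}",
            f"CLUES: {demo['clues']}",
            f"REASONING: {demo['reasoning']}",
            f"LABEL: {demo['label']}",
            "",
        ]
    lines += [f"Input: {text}", "CLUES:"]
    return "\n".join(lines)

if __name__ == "__main__":
    demos = [{
        "text": "The plot drags, but the acting is quietly brilliant.",
        "clues": "'drags' (negative), 'quietly brilliant' (positive), contrast via 'but'",
        "reasoning": "The contrastive 'but' makes the positive clause dominant.",
        "label": "positive",
    }]
    print(build_carp_prompt("I can't say I was thrilled by the ending.", demos))
```

The LLM's completion after "CLUES:" would then contain its clues, reasoning, and label, with the label parsed from the final line.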
Remarkably, CARP yields new SOTA performances on 4 out of 5 widely-used
text-classification benchmarks: 97.39 (+1.24) on SST-2, 96.40 (+0.72) on
AGNews, 98.78 (+0.25) on R8, and 96.95 (+0.6) on R52, with performance
comparable to SOTA on MR (92.39 vs. 93.3). More importantly, we find that CARP
performs impressively in low-resource and domain-adaptation setups.
Specifically, using only 16 examples per class, CARP achieves performance
comparable to supervised models trained with 1,024 examples per class.
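As a rough illustration of the $k$NN demonstration search described in the abstract: training examples are embedded with the fine-tuned model, and the k nearest neighbors of the test input are used as in-context demonstrations. The sketch below assumes the embeddings have already been computed by such an encoder; the knn_demonstrations helper and the toy data are hypothetical.

```python
# Hypothetical sketch of kNN demonstration retrieval for in-context learning.
# Assumes `train_embs` / `query_emb` were produced by an encoder fine-tuned on
# the labeled dataset (not shown); cosine similarity picks the demonstrations.
import numpy as np

def knn_demonstrations(query_emb, train_embs, train_examples, k=8):
    """Return the k training examples most similar to the query embedding."""
    q = query_emb / np.linalg.norm(query_emb)
    t = train_embs / np.linalg.norm(train_embs, axis=1, keepdims=True)
    sims = t @ q                     # cosine similarity to every training item
    top = np.argsort(-sims)[:k]      # indices of the k nearest neighbors
    return [train_examples[i] for i in top]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train_embs = rng.normal(size=(100, 32))           # stand-in embeddings
    train_examples = [f"example {i}" for i in range(100)]
    query_emb = rng.normal(size=32)
    for demo in knn_demonstrations(query_emb, train_embs, train_examples, k=4):
        print(demo)
```

The retrieved examples would then be formatted as demonstrations (with their clues and reasoning) inside the prompt from the previous sketch.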
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is their ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems in domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- ILLUMINER: Instruction-tuned Large Language Models as Few-shot Intent Classifier and Slot Filler [1.9015367254988451]
This study evaluates instruction-tuned models (Instruct-LLMs) on popular benchmark datasets for intent classification (IC) and slot filling (SF).
We introduce ILLUMINER, an approach framing IC and SF as language generation tasks for Instruct-LLMs, with a more efficient SF-prompting method compared to prior work.
A comprehensive comparison with multiple baselines shows that our approach, using the FLAN-T5 11B model, outperforms the state-of-the-art joint IC+SF method and in-context learning with GPT-3.5 (175B).
arXiv Detail & Related papers (2024-03-26T09:41:21Z)
- Efficient argument classification with compact language models and ChatGPT-4 refinements [0.0]
This paper presents comparative studies of several deep learning-based models for argument mining.
The main novelty of this paper is an ensemble model that combines the BERT architecture with ChatGPT-4 as a refinement model.
The presented results show that BERT+ChatGPT-4 outperforms the rest of the models including other Transformer-based and LSTM-based models.
arXiv Detail & Related papers (2024-03-20T16:24:10Z)
- Exploring Small Language Models with Prompt-Learning Paradigm for Efficient Domain-Specific Text Classification [2.410463233396231]
Small language models (SLMs) offer significant customizability, adaptability, and cost-effectiveness for domain-specific tasks.
In few-shot settings where prompt-based model fine-tuning is possible, T5-base, a typical SLM with 220M parameters, achieves approximately 75% accuracy with limited labeled data.
In zero-shot settings with a fixed model, we underscore the pivotal observation that, although GPT-3.5-turbo, with around 154B parameters, garners an accuracy of 55.16%, the power of well-designed prompts becomes evident.
arXiv Detail & Related papers (2023-09-26T09:24:46Z)
- Better Zero-Shot Reasoning with Role-Play Prompting [10.90357246745529]
Role-play prompting consistently surpasses the standard zero-shot approach across most datasets.
This highlights its potential to augment the reasoning capabilities of large language models.
arXiv Detail & Related papers (2023-08-15T11:08:30Z)
- Pushing the Limits of ChatGPT on NLP Tasks [79.17291002710517]
Despite the success of ChatGPT, its performance on most NLP tasks is still well below supervised baselines.
In this work, we look into the causes and find that its subpar performance stems from several factors.
We propose a collection of general modules to address these issues, in an attempt to push the limits of ChatGPT on NLP tasks.
arXiv Detail & Related papers (2023-06-16T09:40:05Z)
- Attention is Not Always What You Need: Towards Efficient Classification of Domain-Specific Text [1.1508304497344637]
For large-scale IT corpora with hundreds of classes organized in a hierarchy, accurate classification at the higher levels of the hierarchy is crucial.
In the business world, an efficient and explainable ML model is preferred over an expensive black-box model, especially if the performance increase is marginal.
Despite the widespread use of PLMs, there is a lack of clear and well-justified reasoning as to why these models are being employed for domain-specific text classification.
arXiv Detail & Related papers (2023-03-31T03:17:23Z)
- Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification [58.720142291102135]
This case study investigates the task of job classification in a real-world setting.
The goal is to determine whether an English-language job posting is appropriate for a graduate or entry-level position.
arXiv Detail & Related papers (2023-03-13T14:09:53Z)
- FCM: Forgetful Causal Masking Makes Causal Language Models Better Zero-Shot Learners [139.6321017962092]
We propose a simple technique that significantly boosts the performance of large language models without adding computational cost.
Our key observation is that, by performing the next-token prediction task with randomly selected past tokens masked out, we can improve the quality of the learned representations (a minimal sketch of this masking appears after this list).
Experimental results show that our method also improves PaLM's zero and few-shot performance on a diverse suite of tasks.
arXiv Detail & Related papers (2022-10-24T17:46:57Z)
- Few-shot Learning with Multilingual Language Models [66.49496434282564]
We train multilingual autoregressive language models on a balanced corpus covering a diverse set of languages.
Our largest model sets new state of the art in few-shot learning in more than 20 representative languages.
We present a detailed analysis of where the model succeeds and fails, showing in particular that it enables cross-lingual in-context learning.
arXiv Detail & Related papers (2021-12-20T16:52:35Z)
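As referenced in the FCM entry above, here is a minimal, hypothetical sketch of what "randomly masking out past tokens during next-token prediction" could look like as an attention-mask modification. It is inferred from the one-sentence summary, not the authors' implementation; the masking probability and tensor layout are illustrative assumptions.

```python
# Hypothetical sketch of forgetful causal masking: a standard causal attention
# mask where each *past* position is additionally dropped with probability p.
# Based only on the one-sentence summary above; not the authors' implementation.
import torch

def forgetful_causal_mask(seq_len: int, p: float = 0.15) -> torch.Tensor:
    """Return a (seq_len, seq_len) boolean mask; True = position may be attended."""
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    keep_past = torch.rand(seq_len, seq_len) >= p  # randomly forget some past tokens
    keep_past.fill_diagonal_(True)                 # never mask the current token
    return causal & keep_past

if __name__ == "__main__":
    mask = forgetful_causal_mask(6, p=0.3)
    print(mask.int())  # would be passed to the attention layer during training only
```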