Rationale-Guided Few-Shot Classification to Detect Abusive Language
- URL: http://arxiv.org/abs/2211.17046v2
- Date: Thu, 27 Jul 2023 19:44:05 GMT
- Title: Rationale-Guided Few-Shot Classification to Detect Abusive Language
- Authors: Punyajoy Saha, Divyanshu Sheth, Kushal Kedia, Binny Mathew, Animesh Mukherjee
- Abstract summary: We propose RGFS (Rationale-Guided Few-Shot Classification) for abusive language detection.
We introduce two rationale-integrated BERT-based architectures (the RGFS models) and evaluate our systems over five different abusive language datasets.
- Score: 5.977278650516324
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Abusive language is a concerning problem in online social media. Past
research on detecting abusive language covers different platforms, languages,
demographies, etc. However, models trained using these datasets do not perform
well in cross-domain evaluation settings. To overcome this, a common strategy
is to use a few samples from the target domain to train models to get better
performance in that domain (cross-domain few-shot training). However, this
might cause the models to overfit the artefacts of those samples. A compelling
solution could be to guide the models toward rationales, i.e., spans of text
that justify the text's label. This method has been found to improve model
performance in the in-domain setting across various NLP tasks. In this paper,
we propose RGFS (Rationale-Guided Few-Shot Classification) for abusive language
detection. We first build a multitask learning setup to jointly learn
rationales, targets, and labels, and find a significant improvement of 6% macro
F1 on the rationale detection task over training solely rationale classifiers.
We introduce two rationale-integrated BERT-based architectures (the RGFS
models) and evaluate our systems over five different abusive language datasets,
finding that in the few-shot classification setting, RGFS-based models
outperform baseline models by about 7% in macro F1 scores and perform
competitively to models finetuned on other source domains. Furthermore,
RGFS-based models outperform LIME/SHAP-based approaches in terms of
plausibility and are close in performance in terms of faithfulness.
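The abstract reports its gains in macro F1, which averages per-class F1 scores without weighting by class size, so a rare class counts as much as a frequent one. The following is a minimal sketch of that metric in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper or its code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: one of two "abusive" posts is missed.
y_true = ["abusive", "abusive", "normal", "normal", "normal", "normal"]
y_pred = ["abusive", "normal", "normal", "normal", "normal", "normal"]
print(round(macro_f1(y_true, y_pred), 3))  # 0.778, while accuracy is 5/6
```

On imbalanced data such as the toy example, macro F1 penalizes the missed minority-class post more heavily than accuracy does, which is one reason it is commonly reported for abusive language detection.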
Related papers
- Adaptable and Reliable Text Classification using Large Language Models [7.962669028039958]
This paper introduces an adaptable and reliable text classification paradigm, which leverages Large Language Models (LLMs).
We evaluated the performance of several LLMs, machine learning algorithms, and neural network-based architectures on four diverse datasets.
It is shown that the system's performance can be further enhanced through few-shot or fine-tuning strategies.
arXiv Detail & Related papers (2024-05-17T04:05:05Z)
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We make empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z)
- Attention is Not Always What You Need: Towards Efficient Classification of Domain-Specific Text [1.1508304497344637]
For large-scale IT corpora with hundreds of classes organized in a hierarchy, the task of accurate classification of classes at the higher level in the hierarchies is crucial.
In the business world, an efficient and explainable ML model is preferred over an expensive black-box model, especially if the performance increase is marginal.
Despite the widespread use of PLMs, there is a lack of clear and well-justified reasoning as to why these models are being employed for domain-specific text classification.
arXiv Detail & Related papers (2023-03-31T03:17:23Z)
- QAGAN: Adversarial Approach To Learning Domain Invariant Language Features [0.76146285961466]
We explore an adversarial training approach towards learning domain-invariant features.
We are able to achieve a 15.2% improvement in EM score and a 5.6% boost in F1 score on an out-of-domain validation dataset.
arXiv Detail & Related papers (2022-06-24T17:42:18Z)
- CLIN-X: pre-trained language models and a study on cross-task transfer for concept extraction in the clinical domain [22.846469609263416]
We introduce the pre-trained CLIN-X (Clinical XLM-R) language models and show how CLIN-X outperforms other pre-trained transformer models.
Our studies reveal stable model performance despite a lack of annotated data with improvements of up to 47 F1 points when only 250 labeled sentences are available.
Our results highlight the importance of specialized language models such as CLIN-X for concept extraction in non-standard domains.
arXiv Detail & Related papers (2021-12-16T10:07:39Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- X2Parser: Cross-Lingual and Cross-Domain Framework for Task-Oriented Compositional Semantic Parsing [51.81533991497547]
Task-oriented compositional semantic parsing (TCSP) handles complex nested user queries.
We present X2Parser, a transferable cross-lingual and cross-domain parser for TCSP.
We propose to predict flattened intents and slots representations separately and cast both prediction tasks into sequence labeling problems.
arXiv Detail & Related papers (2021-06-07T16:40:05Z)
- On Universal Black-Box Domain Adaptation [53.7611757926922]
We study an arguably least restrictive setting of domain adaptation in a sense of practical deployment.
Only the interface of the source model is available to the target domain, and the label-space relations between the two domains are allowed to be different and unknown.
We propose to unify them into a self-training framework, regularized by consistency of predictions in local neighborhoods of target samples.
arXiv Detail & Related papers (2021-04-10T02:21:09Z)
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)