BEIKE NLP at SemEval-2022 Task 4: Prompt-Based Paragraph Classification
for Patronizing and Condescending Language Detection
- URL: http://arxiv.org/abs/2208.01312v1
- Date: Tue, 2 Aug 2022 08:38:47 GMT
- Title: BEIKE NLP at SemEval-2022 Task 4: Prompt-Based Paragraph Classification
for Patronizing and Condescending Language Detection
- Authors: Yong Deng, Chenxiao Dou, Liangyu Chen, Deqiang Miao, Xianghui Sun,
Baochang Ma, Xiangang Li
- Abstract summary: The PCL detection task aims to identify language that is patronizing or condescending towards vulnerable communities in the general media.
In this paper, we give an introduction to our solution, which exploits the power of prompt-based learning on paragraph classification.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The PCL detection task aims to identify and categorize language that is
patronizing or condescending towards vulnerable communities in the general
media. Compared to other paragraph-classification tasks in NLP, the negative
language in the PCL detection task is usually more implicit and subtle, and
therefore harder to recognize, which makes the performance of common
text-classification approaches disappointing. In this paper, we introduce our
team's solution to the PCL detection problem in SemEval-2022 Task 4, which
exploits the power of prompt-based learning on paragraph classification. We
reformulate the task as an appropriate cloze prompt and use pre-trained Masked
Language Models to fill the cloze slot. For the two subtasks, binary
classification and multi-label classification, the DeBERTa model is adopted and
fine-tuned to predict the masked label words of task-specific prompts. On the
evaluation dataset, our approach achieves an F1-score of 0.6406 for binary
classification; for multi-label classification, it achieves a
macro-F1-score of 0.4689 and ranks first on the leaderboard.
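The cloze reformulation described in the abstract can be illustrated with a minimal, dependency-free sketch. The template wording, the label words "yes"/"no", and the helper names below are hypothetical illustrations, not the prompts or code used by the authors. The dictionary of logits stands in for the scores a masked language model such as DeBERTa would assign to candidate words at the mask position.

```python
import math

# Hypothetical cloze template and label words (assumptions for illustration;
# the paper's actual prompts are not given in this abstract).
TEMPLATE = "{paragraph} Is this paragraph patronizing? {mask}."
VERBALIZER = {"yes": 1, "no": 0}  # label word -> class id


def build_prompt(paragraph: str, mask_token: str = "[MASK]") -> str:
    """Wrap a paragraph in the cloze template with a single mask slot."""
    return TEMPLATE.format(paragraph=paragraph.strip(), mask=mask_token)


def label_word_probs(mask_logits: dict) -> dict:
    """Softmax restricted to the verbalizer's label words.

    `mask_logits` maps vocabulary words to the logits at the mask position;
    words outside the verbalizer are ignored.
    """
    scores = {word: mask_logits[word] for word in VERBALIZER}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(score - top) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


def classify(mask_logits: dict) -> int:
    """Return the class id of the most probable label word."""
    probs = label_word_probs(mask_logits)
    best_word = max(probs, key=probs.get)
    return VERBALIZER[best_word]


if __name__ == "__main__":
    print(build_prompt("These poor families need our help to survive."))
    # Made-up logits standing in for a masked language model's output.
    print(classify({"yes": 2.1, "no": 0.3, "maybe": 1.0}))
```

Under the same assumptions, the multi-label subtask could be approached by building one such prompt per category and reading a yes/no decision from each; whether the authors did so is not stated in the abstract.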
Related papers
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538] (2024-06-20T18:35:47Z)
We propose a novel co-training method that assigns weights based on the training dynamics of the classifiers to the distantly supervised labels.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
- Blueprinting the Future: Automatic Item Categorization using Hierarchical Zero-Shot and Few-Shot Classifiers [6.907552533477328] (2023-12-06T15:51:49Z)
This study unveils a novel approach employing the zero-shot and few-shot Generative Pretrained Transformer (GPT) for hierarchical item categorization.
The hierarchical nature of examination blueprints is navigated seamlessly, allowing for a tiered classification of items across multiple levels.
An initial simulation with artificial data demonstrates the efficacy of this method, achieving an average accuracy of 92.91% measured by the F1 score.
- Token Prediction as Implicit Classification to Identify LLM-Generated Text [37.89852204279844] (2023-11-15T06:33:52Z)
This paper introduces a novel approach for identifying the possible large language models (LLMs) involved in text generation.
Instead of adding an additional classification layer to a base LM, we reframe the classification task as a next-token prediction task.
We utilize the Text-to-Text Transfer Transformer (T5) model as the backbone for our experiments.
- Slot Induction via Pre-trained Language Model Probing and Multi-level Contrastive Learning [62.839109775887025] (2023-08-09T05:08:57Z)
The Slot Induction (SI) task aims to induce slot boundaries without explicit knowledge of token-level slot annotations.
We propose leveraging Unsupervised Pre-trained Language Model (PLM) Probing and a Contrastive Learning mechanism to exploit unsupervised semantic knowledge extracted from the PLM.
Our approach is shown to be effective in the SI task and capable of bridging the gaps with token-level supervised models on two NLU benchmark datasets.
- Automated Few-shot Classification with Instruction-Finetuned Language Models [76.69064714392165] (2023-05-21T21:50:27Z)
We show that AuT-Few outperforms state-of-the-art few-shot learning methods.
We also show that AuT-Few is the best ranking method across datasets on the RAFT few-shot benchmark.
- Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection [98.66771688028426] (2023-03-27T07:46:58Z)
We propose Ambiguity-Resistant Semi-supervised Learning (ARSL) for one-stage detectors.
Joint-Confidence Estimation (JCE) is proposed to quantify the classification and localization quality of pseudo labels.
ARSL effectively mitigates the ambiguities and achieves state-of-the-art SSOD performance on MS COCO and PASCAL VOC.
- Task-Specific Embeddings for Ante-Hoc Explainable Text Classification [6.671252951387647] (2022-11-30T19:56:25Z)
We propose an alternative training objective in which we learn task-specific embeddings of text.
Our proposed objective learns embeddings such that all texts that share the same target class label should be close together.
We present extensive experiments which show that the benefits of ante-hoc explainability and incremental learning come at no cost in overall classification accuracy.
- Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values [73.82043713141142] (2022-10-14T09:10:49Z)
Many NLP classification tasks, such as sexism/racism detection or toxicity detection, are based on human values.
We introduce a framework for value-aligned classification that performs prediction based on explicitly written human values in the command.
- UMass PCL at SemEval-2022 Task 4: Pre-trained Language Model Ensembles for Detecting Patronizing and Condescending Language [0.0] (2022-04-18T13:22:10Z)
Patronizing and condescending language (PCL) is everywhere, but rarely is the focus on its use by media towards vulnerable communities.
In this paper, we describe our system for detecting such language which was submitted to SemEval 2022 Task 4: Patronizing and Condescending Language Detection.
- Automatic Multi-Label Prompting: Simple and Interpretable Few-Shot Classification [15.575483080819563] (2022-04-13T11:15:52Z)
We propose Automatic Multi-Label Prompting (AMuLaP) to automatically select label mappings for few-shot text classification with prompting.
Our method exploits one-to-many label mappings and a statistics-based algorithm to select label mappings given a prompt template.
Our experiments demonstrate that AMuLaP achieves competitive performance on the GLUE benchmark without human effort or external resources.
- Multitask Learning for Class-Imbalanced Discourse Classification [74.41900374452472] (2021-01-02T07:13:41Z)
We show that a multitask approach can improve Micro F1-score by 7% over current state-of-the-art benchmarks.
We also offer a comparative review of additional techniques proposed to address resource-poor problems in NLP.