Related papers: NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques

NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques

URL: http://arxiv.org/abs/2102.12254v1
Date: Wed, 24 Feb 2021 12:30:09 GMT
Title: NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques
Authors: Gunjan Chhablani, Yash Bhartia, Abheesht Sharma, Harshit Pandey, Shan Suthaharan
Abstract summary: In this paper, we explore simple versions of Token Classification or Span Prediction approaches. We use BERT-based models -- BERT, RoBERTa, and SpanBERT for both approaches. To this end, we investigate results on four hybrid approaches -- Multi-Span, Span+Token, LSTM-CRF, and a combination of predicted offsets using union/intersection.
Score: 0.6850683267295249
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Toxicity detection of text has been a popular NLP task in the recent years. In SemEval-2021 Task-5 Toxic Spans Detection, the focus is on detecting toxic spans within passages. Most state-of-the-art span detection approaches employ various techniques, each of which can be broadly classified into Token Classification or Span Prediction approaches. In our paper, we explore simple versions of both of these approaches and their performance on the task. Specifically, we use BERT-based models -- BERT, RoBERTa, and SpanBERT for both approaches. We also combine these approaches and modify them to bring improvements for Toxic Spans prediction. To this end, we investigate results on four hybrid approaches -- Multi-Span, Span+Token, LSTM-CRF, and a combination of predicted offsets using union/intersection. Additionally, we perform a thorough ablative analysis and analyze our observed results. Our best submission -- a combination of SpanBERT Span Predictor and RoBERTa Token Classifier predictions -- achieves an F1 score of 0.6753 on the test set. Our best post-eval F1 score is 0.6895 on intersection of predicted offsets from top-3 RoBERTa Token Classification checkpoints. These approaches improve the performance by 3% on average than those of the shared baseline models -- RNNSL and SpaCy NER.

Related papers

Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness [53.96714099151378]
We propose a three-step approach for parameter-efficient fine-tuning of image-text foundation models. Our method improves its two key components: minority samples identification and the robust training algorithm. Our theoretical analysis shows that our PPA enhances minority group identification and is Bayes optimal for minimizing the balanced group error.
arXiv Detail & Related papers (2025-03-12T15:46:12Z)
A Novel Two-Step Fine-Tuning Pipeline for Cold-Start Active Learning in Text Classification Tasks [7.72751543977484]
This work investigates the effectiveness of BERT-based contextual embeddings in active learning (AL) tasks on cold-start scenarios. Our primary contribution is the proposal of a more robust fine-tuning pipeline - DoTCAL. Our evaluation contrasts BERT-based embeddings with other prevalent text representation paradigms, including Bag of Words (BoW), Latent Semantic Indexing (LSI) and FastText.
arXiv Detail & Related papers (2024-07-24T13:50:21Z)
Combine and Conquer: A Meta-Analysis on Data Shift and Out-of-Distribution Detection [30.377446496559635]
This paper introduces a universal approach to seamlessly combine out-of-distribution (OOD) detection scores. Our framework is easily for future developments in detection scores and stands as the first to combine decision boundaries in this context.
arXiv Detail & Related papers (2024-06-23T08:16:44Z)
JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data. Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation. We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
CLaC at SemEval-2023 Task 2: Comparing Span-Prediction and Sequence-Labeling approaches for NER [0.554780083433538]
This paper summarizes the CLaC submission for the MultiCoNER 2 task. We compare two popular approaches for NER, namely Sequence Labeling and Span Prediction. We find that our best Span Prediction system performs slightly better than our best Sequence Labeling system on test data.
arXiv Detail & Related papers (2023-05-05T20:49:40Z)
NP-Match: Towards a New Probabilistic Model for Semi-Supervised Learning [86.60013228560452]
Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data. In this work, we adjust neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match. NP-Match implicitly compares data points when making predictions, and as a result, the prediction of each unlabeled data point is affected by the labeled data points.
arXiv Detail & Related papers (2023-01-31T11:44:45Z)
Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data. We design a simple but effective ensemble-based framework that combines various transfer learning techniques. We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z)
Meta-Wrapper: Differentiable Wrapping Operator for User Interest Selection in CTR Prediction [97.99938802797377]
Click-through rate (CTR) prediction, whose goal is to predict the probability of the user to click on an item, has become increasingly significant in recommender systems. Recent deep learning models with the ability to automatically extract the user interest from his/her behaviors have achieved great success. We propose a novel approach under the framework of the wrapper method, which is named Meta-Wrapper.
arXiv Detail & Related papers (2022-06-28T03:28:15Z)
UPB at SemEval-2021 Task 5: Virtual Adversarial Training for Toxic Spans Detection [0.7197592390105455]
Semeval-2021, Task 5 - Toxic Spans Detection is based on a novel annotation of a subset of the Jigsaw Unintended Bias dataset. For this task, participants had to automatically detect character spans in short comments that render the message as toxic. Our model considers applying Virtual Adversarial Training in a semi-supervised setting during the fine-tuning process of several Transformer-based models.
arXiv Detail & Related papers (2021-04-17T19:42:12Z)
MIPT-NSU-UTMN at SemEval-2021 Task 5: Ensembling Learning with Pre-trained Language Models for Toxic Spans Detection [0.0]
We developed ensemble models using BERT-based neural architectures and post-processing to combine tokens into spans. We evaluated several pre-trained language models using various ensemble techniques for toxic span identification and achieved sizable improvements over our baseline fine-tuned BERT models.
arXiv Detail & Related papers (2021-04-10T11:27:32Z)
Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity. We present a discriminative nearest neighbor classification with deep self-attention. We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning [91.13797346047984]
We introduce ADAHESSIAN, a second order optimization algorithm which dynamically incorporates the curvature of the loss function via ADAptive estimates. We show that ADAHESSIAN achieves new state-of-the-art results by a large margin as compared to other adaptive optimization methods.
arXiv Detail & Related papers (2020-06-01T05:00:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.