Lone Pine at SemEval-2021 Task 5: Fine-Grained Detection of Hate Speech
Using BERToxic
- URL: http://arxiv.org/abs/2104.03506v1
- Date: Thu, 8 Apr 2021 04:46:14 GMT
- Title: Lone Pine at SemEval-2021 Task 5: Fine-Grained Detection of Hate Speech
Using BERToxic
- Authors: Yakoob Khan, Weicheng Ma, Soroush Vosoughi
- Abstract summary: This paper describes our approach to the Toxic Spans Detection problem.
We propose BERToxic, a system that fine-tunes a pre-trained BERT model to locate toxic text spans in a given text.
Our system significantly outperformed the provided baseline and achieved an F1-score of 0.683, placing Lone Pine in the 17th place out of 91 teams in the competition.
- Score: 2.4815579733050153
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes our approach to the Toxic Spans Detection problem
(SemEval-2021 Task 5). We propose BERToxic, a system that fine-tunes a
pre-trained BERT model to locate toxic text spans in a given text and utilizes
additional post-processing steps to refine the boundaries. The post-processing
steps involve (1) labeling character offsets between consecutive toxic tokens
as toxic and (2) assigning a toxic label to words that have at least one token
labeled as toxic. Through experiments, we show that these two post-processing
steps improve the performance of our model by 4.16% on the test set. We also
studied the effects of data augmentation and ensemble modeling strategies on
our system. Our system significantly outperformed the provided baseline and
achieved an F1-score of 0.683, placing Lone Pine in the 17th place out of 91
teams in the competition. Our code is made available at
https://github.com/Yakoob-Khan/Toxic-Spans-Detection
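The two post-processing heuristics described in the abstract can be expressed compactly. The sketch below is a minimal illustration written from the abstract, not the authors' released implementation (see the repository linked above); the token tuple, character-offset bookkeeping, and whitespace word segmentation are simplifying assumptions.
```python
# Minimal sketch of the two post-processing steps described above (not the
# authors' code). Assumes each model prediction is a (char_start, char_end,
# predicted_toxic) tuple aligned to the original text.
import re
from typing import List, Set, Tuple

Token = Tuple[int, int, bool]  # (char_start, char_end, predicted_toxic)

def toxic_char_offsets(text: str, tokens: List[Token]) -> Set[int]:
    # Character offsets covered by tokens the model labeled toxic.
    toxic = {i for s, e, t in tokens if t for i in range(s, e)}

    # Step 1: label character offsets between consecutive toxic tokens as
    # toxic, covering whitespace/punctuation the tokenizer split away.
    for (s1, e1, t1), (s2, e2, t2) in zip(tokens, tokens[1:]):
        if t1 and t2:
            toxic.update(range(e1, s2))

    # Step 2: assign a toxic label to every word that has at least one
    # toxic token, i.e. extend partial-word predictions to the whole word.
    for word in re.finditer(r"\S+", text):
        if any(i in toxic for i in range(word.start(), word.end())):
            toxic.update(range(word.start(), word.end()))

    return toxic
```
The task's evaluation compares predicted and gold sets of toxic character offsets, so sorting this set yields a submission-ready list of toxic spans.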
Related papers
- Toxic Subword Pruning for Dialogue Response Generation on Large Language Models [51.713448010799986]
We propose Toxic Subword Pruning (ToxPrune) to prune the subwords contained in toxic words from the BPE vocabulary of trained LLMs.
ToxPrune simultaneously yields a clear improvement for the toxic language model NSFW-3B on the task of dialogue response generation.
arXiv Detail & Related papers (2024-10-05T13:30:33Z) - Unveiling the Implicit Toxicity in Large Language Models [77.90933074675543]
The open-endedness of large language models (LLMs), combined with their impressive capabilities, may lead to new safety issues when they are exploited for malicious use.
We show that LLMs can generate diverse implicit toxic outputs that are exceptionally difficult to detect via simple zero-shot prompting.
We propose a reinforcement learning (RL) based attacking method to further induce the implicit toxicity in LLMs.
arXiv Detail & Related papers (2023-11-29T06:42:36Z) - Comprehensive Assessment of Toxicity in ChatGPT [49.71090497696024]
We evaluate the toxicity in ChatGPT by utilizing instruction-tuning datasets.
Prompts in creative writing tasks can be 2x more likely to elicit toxic responses.
Certain deliberately toxic prompts, designed in earlier studies, no longer yield harmful responses.
arXiv Detail & Related papers (2023-11-03T14:37:53Z) - ToxiSpanSE: An Explainable Toxicity Detection in Code Review Comments [4.949881799107062]
ToxiSpanSE is the first tool to detect toxic spans in the Software Engineering (SE) domain.
Our model achieved the best score with 0.88 F1, 0.87 precision, and 0.93 recall for toxic class tokens.
arXiv Detail & Related papers (2023-07-07T04:55:11Z) - Detoxifying Text with MaRCo: Controllable Revision with Experts and
Anti-Experts [57.38912708076231]
We introduce MaRCo, a detoxification algorithm that combines controllable generation and text rewriting methods.
MaRCo uses likelihoods under a non-toxic LM and a toxic LM to find candidate words to mask and potentially replace.
We evaluate our method on several subtle toxicity and microaggression datasets, and show that it not only outperforms baselines on automatic metrics, but that MaRCo's rewrites are also preferred 2.1 times more often in human evaluation (a minimal sketch of this likelihood contrast appears after this list).
arXiv Detail & Related papers (2022-12-20T18:50:00Z) - Cisco at SemEval-2021 Task 5: What's Toxic?: Leveraging Transformers for
Multiple Toxic Span Extraction from Online Comments [1.332560004325655]
This paper describes the system proposed by team Cisco for SemEval-2021 Task 5: Toxic Spans Detection.
We approach this problem primarily in two ways: a sequence tagging approach and a dependency parsing approach.
Our best-performing architecture overall achieved an F1 score of 0.6922.
arXiv Detail & Related papers (2021-05-28T16:27:49Z) - UoT-UWF-PartAI at SemEval-2021 Task 5: Self Attention Based Bi-GRU with
Multi-Embedding Representation for Toxicity Highlighter [3.0586855806896045]
We propose a self-attention-based gated recurrent unit with a multi-embedding representation of the tokens.
Experimental results show that our proposed approach is very effective in detecting span tokens.
arXiv Detail & Related papers (2021-04-27T13:18:28Z) - UIT-ISE-NLP at SemEval-2021 Task 5: Toxic Spans Detection with
BiLSTM-CRF and Toxic Bert Comment Classification [0.0]
This task aims to build a model for identifying toxic words in whole posts.
We use a BiLSTM-CRF model combined with Toxic BERT classification to train the detection model.
Our model achieved an F1-score of 62.23% on the Toxic Spans Detection task.
arXiv Detail & Related papers (2021-04-20T16:32:56Z) - UIT-E10dot3 at SemEval-2021 Task 5: Toxic Spans Detection with Named
Entity Recognition and Question-Answering Approaches [0.32228025627337864]
This task asks competitors to extract toxic spans from the given texts, and we carried out several analyses to understand the task's structure before running experiments.
We solve this task with two approaches: named entity recognition with the spaCy library, and question answering with RoBERTa combined with ToxicBERT; the former attains the higher F1-score of 66.99%.
arXiv Detail & Related papers (2021-04-15T11:07:56Z) - Challenges in Automated Debiasing for Toxic Language Detection [81.04406231100323]
Biased associations have been a challenge in the development of classifiers for detecting toxic language.
We investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection.
Our focus is on lexical (e.g., swear words, slurs, identity mentions) and dialectal markers (specifically African American English).
arXiv Detail & Related papers (2021-01-29T22:03:17Z) - RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language
Models [93.151822563361]
Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise toxic language which hinders their safe deployment.
We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration.
arXiv Detail & Related papers (2020-09-24T03:17:19Z)
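The likelihood-contrast masking step mentioned in the MaRCo entry above can be sketched as follows. This is an illustrative approximation, not the MaRCo authors' implementation: the GPT-2 checkpoints are placeholders standing in for language models adapted to non-toxic and toxic text, and the margin threshold is arbitrary.
```python
# Rough sketch of likelihood-contrast masking (illustration only, not MaRCo's
# released code). Two causal LMs stand in for the non-toxic and toxic experts.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm_nontoxic = AutoModelForCausalLM.from_pretrained("gpt2").eval()  # placeholder
lm_toxic = AutoModelForCausalLM.from_pretrained("gpt2").eval()     # placeholder

def token_logprobs(lm, ids):
    """Log-probability of each observed token (from position 1 on) under a causal LM."""
    with torch.no_grad():
        logits = lm(ids).logits                   # (1, seq, vocab)
    logp = F.log_softmax(logits[:, :-1], dim=-1)  # predictions for the next token
    return logp.gather(-1, ids[:, 1:, None]).squeeze(-1)

def mask_candidates(text: str, margin: float = 1.0):
    """Flag tokens the toxic LM likes much more than the non-toxic LM."""
    ids = tok(text, return_tensors="pt").input_ids
    diff = token_logprobs(lm_toxic, ids) - token_logprobs(lm_nontoxic, ids)
    positions = (diff[0] > margin).nonzero().flatten() + 1  # +1: next-token offset
    return [tok.decode(int(ids[0, i])) for i in positions]
```
In practice the flagged tokens would then be masked and re-generated under the non-toxic model to produce the detoxified rewrite.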