MIPT-NSU-UTMN at SemEval-2021 Task 5: Ensembling Learning with
Pre-trained Language Models for Toxic Spans Detection
- URL: http://arxiv.org/abs/2104.04739v1
- Date: Sat, 10 Apr 2021 11:27:32 GMT
- Title: MIPT-NSU-UTMN at SemEval-2021 Task 5: Ensembling Learning with
Pre-trained Language Models for Toxic Spans Detection
- Authors: Mikhail Kotyushev, Anna Glazkova, Dmitry Morozov
- Abstract summary: We developed ensemble models using BERT-based neural architectures and post-processing to combine tokens into spans.
We evaluated several pre-trained language models using various ensemble techniques for toxic span identification and achieved sizable improvements over our baseline fine-tuned BERT models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes our system for SemEval-2021 Task 5 on Toxic Spans
Detection. We developed ensemble models using BERT-based neural architectures
and post-processing to combine tokens into spans. We evaluated several
pre-trained language models using various ensemble techniques for toxic span
identification and achieved sizable improvements over our baseline fine-tuned
BERT models. Finally, our system obtained an F1-score of 67.55% on test data.
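The abstract describes the pipeline only at a high level. As a rough, hypothetical illustration of the two ideas it mentions (ensembling token-level predictions and post-processing tokens into spans), the sketch below averages per-token toxicity probabilities from several fine-tuned models and merges tokens above a threshold into character offsets; all names, the 0.5 threshold, and the input format are assumptions, not the authors' implementation.

```python
# Illustrative sketch: average token-level toxicity probabilities from an
# ensemble of fine-tuned models, then merge tokens predicted as toxic into
# character offsets. Names, the 0.5 threshold, and the input format are
# assumptions, not the authors' code.
from typing import List, Tuple

Token = Tuple[int, int]  # (char_start, char_end) of a token in the comment


def ensemble_probs(model_probs: List[List[float]]) -> List[float]:
    """Average per-token toxicity probabilities over the ensemble."""
    n_models = len(model_probs)
    return [sum(token_probs) / n_models for token_probs in zip(*model_probs)]


def tokens_to_offsets(tokens: List[Token], probs: List[float],
                      threshold: float = 0.5) -> List[int]:
    """Collect character offsets of tokens whose averaged toxicity
    probability exceeds the threshold; adjacent toxic tokens naturally
    merge into contiguous spans of offsets."""
    offsets = set()
    for (start, end), p in zip(tokens, probs):
        if p >= threshold:
            offsets.update(range(start, end))
    return sorted(offsets)


# Toy example: two models scoring the three tokens of "you are dumb".
tokens = [(0, 3), (4, 7), (8, 12)]
probs = ensemble_probs([[0.1, 0.2, 0.9], [0.2, 0.1, 0.8]])
print(tokens_to_offsets(tokens, probs))  # -> [8, 9, 10, 11]
```

In the SemEval-2021 Task 5 setting, predictions are lists of toxic character offsets per comment, which is why the sketch returns sorted character indices rather than (start, end) pairs.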
Related papers
- Ensembling Finetuned Language Models for Text Classification [55.15643209328513]
Finetuning is a common practice across different communities to adapt pretrained models to particular tasks.
Ensembles of neural networks are typically used to boost performance and provide reliable uncertainty estimates.
We present a metadataset with predictions from five large finetuned models on six datasets and report results of different ensembling strategies.
arXiv Detail & Related papers (2024-10-25T09:15:54Z)
- Multilingual E5 Text Embeddings: A Technical Report [63.503320030117145]
Three embedding models of different sizes are provided, offering a balance between inference efficiency and embedding quality.
We introduce a new instruction-tuned embedding model, whose performance is on par with state-of-the-art, English-only models of similar sizes.
arXiv Detail & Related papers (2024-02-08T13:47:50Z)
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- UoB at SemEval-2021 Task 5: Extending Pre-Trained Language Models to Include Task and Domain-Specific Information for Toxic Span Prediction [0.8376091455761259]
Toxicity is pervasive in social media and poses a major threat to the health of online communities.
The recent introduction of pre-trained language models, which have achieved state-of-the-art results in many NLP tasks, has transformed the way we approach natural language processing.
arXiv Detail & Related papers (2021-10-07T18:29:06Z)
- The USYD-JD Speech Translation System for IWSLT 2021 [85.64797317290349]
This paper describes the University of Sydney & JD's joint submission to the IWSLT 2021 low-resource speech translation task.
We trained our models with the officially provided ASR and MT datasets.
To achieve better translation performance, we explored the most recent effective strategies, including back translation, knowledge distillation, multi-feature reranking and transductive finetuning.
arXiv Detail & Related papers (2021-07-24T09:53:34Z)
- Methods for Detoxification of Texts for the Russian Language [55.337471467610094]
We introduce the first study of automatic detoxification of Russian texts to combat offensive language.
We test two types of models: an unsupervised approach that performs local corrections and a supervised approach based on the pretrained GPT-2 language model.
The results show that the tested approaches can be successfully used for detoxification, although there is room for improvement.
arXiv Detail & Related papers (2021-05-19T10:37:44Z)
- UPB at SemEval-2021 Task 5: Virtual Adversarial Training for Toxic Spans Detection [0.7197592390105455]
SemEval-2021 Task 5 (Toxic Spans Detection) is based on a novel annotation of a subset of the Jigsaw Unintended Bias dataset.
For this task, participants had to automatically detect character spans in short comments that render the message toxic.
Our approach applies Virtual Adversarial Training in a semi-supervised setting during the fine-tuning of several Transformer-based models.
arXiv Detail & Related papers (2021-04-17T19:42:12Z)
- UTNLP at SemEval-2021 Task 5: A Comparative Analysis of Toxic Span Detection using Attention-based, Named Entity Recognition, and Ensemble Models [6.562256987706127]
This paper presents the methodology and results of our team, UTNLP, in the SemEval-2021 shared task 5 on toxic spans detection.
The experiments start with keyword-based models and are followed by attention-based, named entity-based, transformers-based, and ensemble models.
Our best approach, an ensemble model, achieves an F1 of 0.684 in the competition's evaluation phase.
arXiv Detail & Related papers (2021-04-10T13:56:03Z)
- NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques [0.6850683267295249]
In this paper, we explore simple versions of Token Classification or Span Prediction approaches.
We use BERT-based models -- BERT, RoBERTa, and SpanBERT -- for both approaches.
To this end, we investigate results on four hybrid approaches -- Multi-Span, Span+Token, LSTM-CRF, and a combination of predicted offsets using union/intersection (a minimal sketch of this offset combination appears after this list).
arXiv Detail & Related papers (2021-02-24T12:30:09Z)
- Yseop at SemEval-2020 Task 5: Cascaded BERT Language Model for Counterfactual Statement Analysis [0.0]
We use a BERT base model for the classification task and build a hybrid BERT Multi-Layer Perceptron system to handle the sequence identification task.
Our experiments show that, while introducing syntactic and semantic features does little to improve the system on the classification task, using these features as cascaded linear inputs to fine-tune the model's sequence-delimiting ability ensures that it outperforms other similar-purpose complex systems, such as BiLSTM-CRF, on the second task.
arXiv Detail & Related papers (2020-05-18T08:19:18Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves a 91.51% F1 score on the English Sub-task A, which is comparable to the first-place result.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)
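As referenced in the NLRG entry above, combining predicted offsets by union or intersection can be shown in a few lines; the set-based representation and function name below are illustrative assumptions rather than that paper's implementation.

```python
# Illustrative sketch of the union/intersection combination of predicted
# toxic character offsets mentioned in the NLRG entry; the set-based
# representation is an assumption made for illustration.
from typing import Iterable, List


def combine_offsets(pred_a: Iterable[int], pred_b: Iterable[int],
                    mode: str = "union") -> List[int]:
    """Combine two models' predicted toxic character offsets."""
    a, b = set(pred_a), set(pred_b)
    merged = a | b if mode == "union" else a & b
    return sorted(merged)


print(combine_offsets([8, 9, 10, 11], [9, 10, 11, 12], mode="union"))
print(combine_offsets([8, 9, 10, 11], [9, 10, 11, 12], mode="intersection"))
```

A union of the two predictions generally trades precision for recall, while an intersection does the opposite.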