UTNLP at SemEval-2021 Task 5: A Comparative Analysis of Toxic Span
Detection using Attention-based, Named Entity Recognition, and Ensemble
Models
- URL: http://arxiv.org/abs/2104.04770v1
- Date: Sat, 10 Apr 2021 13:56:03 GMT
- Title: UTNLP at SemEval-2021 Task 5: A Comparative Analysis of Toxic Span
Detection using Attention-based, Named Entity Recognition, and Ensemble
Models
- Authors: Alireza Salemi, Nazanin Sabri, Emad Kebriaei, Behnam Bahrak, Azadeh
Shakery
- Abstract summary: This paper presents our team UTNLP's methodology and results in the SemEval-2021 shared task 5 on toxic spans detection.
The experiments start with keyword-based models and are followed by attention-based, named entity-based, transformers-based, and ensemble models.
Our best approach, an ensemble model, achieves an F1 of 0.684 in the competition's evaluation phase.
- Score: 6.562256987706127
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detecting which parts of a sentence contribute to that sentence's toxicity --
rather than providing a sentence-level verdict of hatefulness -- would increase
the interpretability of models and allow human moderators to better understand
the outputs of the system. This paper presents our team UTNLP's methodology
and results in the SemEval-2021 shared task 5 on toxic spans detection. We test
multiple models and contextual embeddings and report the best-performing
setting. The experiments start with keyword-based models and are followed by
attention-based, named entity-based, transformers-based, and ensemble models.
Our best approach, an ensemble model, achieves an F1 of 0.684 in the
competition's evaluation phase.
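For context, the shared task scores systems with a character-level F1: each system predicts the set of toxic character offsets in a post, precision and recall are computed against the annotated offsets, and the per-post scores are averaged. The sketch below illustrates that scoring together with one simple way an ensemble could combine its members; the majority vote and the names `ensemble_vote` and `span_f1` are illustrative assumptions, not the authors' exact method.

```python
from collections import Counter

def ensemble_vote(predictions, min_votes=2):
    """predictions: one set of predicted toxic character offsets per model.
    Keep an offset if at least `min_votes` models predicted it."""
    counts = Counter(off for pred in predictions for off in pred)
    return {off for off, c in counts.items() if c >= min_votes}

def span_f1(pred, gold):
    """Character-offset F1 for a single post, as used in the shared task."""
    if not pred or not gold:
        return 1.0 if pred == gold else 0.0
    tp = len(pred & gold)
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall) if tp else 0.0

# Example: offsets predicted by three models for one comment.
model_preds = [{5, 6, 7, 8}, {5, 6, 7}, {6, 7, 8, 9}]
print(span_f1(ensemble_vote(model_preds), {5, 6, 7, 8}))  # -> 1.0
```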
Related papers
- SASWISE-UE: Segmentation and Synthesis with Interpretable Scalable Ensembles for Uncertainty Estimation [6.082812294410541]
This paper introduces an efficient sub-model ensemble framework aimed at enhancing the interpretability of medical deep learning models.
By generating uncertainty maps, this framework enables end-users to evaluate the reliability of model outputs.
arXiv Detail & Related papers (2024-11-08T04:37:55Z)
- An Energy-based Model for Word-level AutoCompletion in Computer-aided Translation [97.3797716862478]
Word-level AutoCompletion (WLAC) is a rewarding yet challenging task in Computer-aided Translation.
Existing work addresses this task through a classification model based on a neural network that maps the hidden vector of the input context into its corresponding label.
This work proposes an energy-based model for WLAC, which enables the context hidden vector to capture crucial information from the source sentence (an illustrative energy scorer appears after this list).
arXiv Detail & Related papers (2024-07-29T15:07:19Z)
- Information-Theoretic Distillation for Reference-less Summarization [67.51150817011617]
We present a novel framework to distill a powerful summarizer based on the information-theoretic objective for summarization.
We start off from Pythia-2.8B as the teacher model, which is not yet capable of summarization.
We arrive at a compact but powerful summarizer with only 568M parameters that performs competitively against ChatGPT.
arXiv Detail & Related papers (2024-03-20T17:42:08Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
We further elaborate the robustness metric: a model is judged robust only if its performance is consistently accurate across the examples of each clique.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models [39.479912987123214]
Self-supervised learning has exhibited a notable ability to solve a wide range of visual or language understanding tasks.
We introduce Fusioner, a lightweight, transformer-based fusion module that pairs frozen visual representations with language concepts.
We show that the proposed fusion approach is effective for any pair of visual and language models, even those pre-trained on uni-modal corpora.
arXiv Detail & Related papers (2022-10-27T02:57:26Z)
- COLO: A Contrastive Learning based Re-ranking Framework for One-Stage Summarization [84.70895015194188]
We propose a Contrastive Learning based re-ranking framework for one-stage summarization called COLO.
COLO boosts the extractive and abstractive results of one-stage systems on the CNN/DailyMail benchmark to 44.58 and 46.33 ROUGE-1, respectively.
arXiv Detail & Related papers (2022-09-29T06:11:21Z)
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- UPB at SemEval-2021 Task 5: Virtual Adversarial Training for Toxic Spans Detection [0.7197592390105455]
SemEval-2021 Task 5, Toxic Spans Detection, is based on a novel annotation of a subset of the Jigsaw Unintended Bias dataset.
For this task, participants had to automatically detect the character spans in short comments that render the message toxic.
Our approach applies Virtual Adversarial Training in a semi-supervised setting during the fine-tuning of several Transformer-based models (a generic VAT sketch appears after this list).
arXiv Detail & Related papers (2021-04-17T19:42:12Z)
- Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection [7.29381091750894]
We propose a novel transformer-based language model fine-tuning approach for fake news detection.
First, the token vocabulary of each model is expanded to capture the actual semantics of professional phrases.
Last, the features extracted by the general-purpose language model RoBERTa and the domain-specific model CT-BERT are fused by a multilayer perceptron to integrate fine-grained and high-level representations (see the fusion sketch after this list).
arXiv Detail & Related papers (2021-01-14T09:05:42Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model with a previously proposed model based on an ensemble of simpler neural networks that detect firearms via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
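For the energy-based WLAC entry above, here is an illustrative contrast with the classification formulation: instead of mapping the context hidden vector to a label over the vocabulary, an energy function scores each (context, candidate) pair and the lowest-energy candidate wins. The bilinear scorer, dimensions, and names below are assumptions for illustration; the paper's architecture may differ.

```python
import torch
import torch.nn as nn

class EnergyScorer(nn.Module):
    """Scores (context, candidate-word) pairs; lower energy = better match."""
    def __init__(self, hidden_dim=768, vocab_size=32000):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, hidden_dim)
        self.bilinear = nn.Bilinear(hidden_dim, hidden_dim, 1)

    def forward(self, context_vec, candidate_ids):
        # context_vec: (batch, dim); candidate_ids: (batch, k) candidate words.
        cand = self.word_emb(candidate_ids)                          # (batch, k, dim)
        ctx = context_vec.unsqueeze(1).expand_as(cand).contiguous()  # (batch, k, dim)
        return self.bilinear(ctx, cand).squeeze(-1)                  # (batch, k) energies

# Pick the lowest-energy candidate instead of taking a softmax over the vocabulary.
scorer = EnergyScorer()
energies = scorer(torch.randn(2, 768), torch.randint(0, 32000, (2, 5)))
best = energies.argmin(dim=-1)  # index of the preferred candidate per example
```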
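For the UPB entry above, a minimal sketch of a Virtual Adversarial Training penalty, assuming a HuggingFace-style model that accepts `inputs_embeds` and returns an object with `.logits`; the single power-iteration step and the `xi`/`eps` hyperparameters follow the common VAT recipe rather than the paper's exact configuration. In the semi-supervised setting, this penalty can also be computed on unlabeled comments, since it needs no gold spans.

```python
import torch
import torch.nn.functional as F

def vat_loss(model, embeds, clean_logits, xi=1e-6, eps=1.0):
    """One power-iteration step of VAT on input embeddings: find the
    perturbation direction that most changes the model's distribution,
    then penalize that change with a KL divergence."""
    # Tiny random direction for the finite-difference estimate.
    d = xi * F.normalize(torch.randn_like(embeds), dim=-1)
    d.requires_grad_()
    adv_logits = model(inputs_embeds=embeds + d).logits
    kl = F.kl_div(F.log_softmax(adv_logits, dim=-1),
                  F.softmax(clean_logits.detach(), dim=-1),
                  reduction="batchmean")
    # Gradient w.r.t. d approximates the most sensitive direction.
    grad, = torch.autograd.grad(kl, d)
    r_adv = eps * F.normalize(grad.detach(), dim=-1)
    # Consistency penalty: predictions should not change under r_adv.
    adv_logits = model(inputs_embeds=embeds + r_adv).logits
    return F.kl_div(F.log_softmax(adv_logits, dim=-1),
                    F.softmax(clean_logits.detach(), dim=-1),
                    reduction="batchmean")
```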
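Finally, the fusion step of the COVID-19 fake-news entry can be pictured as below: features from a general-purpose encoder (RoBERTa) and a domain-specific one (CT-BERT) are concatenated and classified by a multilayer perceptron. The feature sizes, hidden width, and class count are assumed values for illustration; the paper's exact fusion head may differ.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Concatenates a general-domain sentence feature (e.g., from RoBERTa)
    with a domain-specific one (e.g., from CT-BERT) and classifies with a
    small multilayer perceptron."""
    def __init__(self, dim_general=768, dim_domain=768, hidden=256, n_classes=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim_general + dim_domain, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, general_feats, domain_feats):
        # Fine-grained + high-level representations, fused by concatenation.
        return self.mlp(torch.cat([general_feats, domain_feats], dim=-1))

# Usage with dummy features for a batch of 4 comments:
head = FusionHead()
logits = head(torch.randn(4, 768), torch.randn(4, 768))  # shape (4, 2)
```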
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.