UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer
Networks for Offensive Language Detection
- URL: http://arxiv.org/abs/2004.11493v2
- Date: Wed, 10 Jun 2020 20:48:08 GMT
- Title: UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer
Networks for Offensive Language Detection
- Authors: Gregor Wiedemann and Seid Muhie Yimam and Chris Biemann
- Abstract summary: Fine-tuning of pre-trained transformer networks such as BERT yields state-of-the-art results for text classification tasks.
Our RoBERTa-based classifier officially ranks 1st in SemEval 2020 Task 12 for the English language.
- Score: 28.701023986344993
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine-tuning of pre-trained transformer networks such as BERT yields
state-of-the-art results for text classification tasks. Typically, fine-tuning
is performed on task-specific training datasets in a supervised manner. One can
also fine-tune in an unsupervised manner beforehand by further pre-training on
the masked language modeling (MLM) task. Here, in-domain data for unsupervised
MLM that resembles the actual classification target dataset allows for domain
adaptation of the model. In this paper, we compare current pre-trained
transformer networks with and without MLM fine-tuning on their performance for
offensive language detection. Our MLM fine-tuned RoBERTa-based classifier
officially ranks 1st in the SemEval 2020 Shared Task 12 for the English
language. Further experiments with the ALBERT model even surpass this result.
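A minimal sketch of the two-stage procedure described in the abstract, written with the Hugging Face transformers and datasets libraries: unsupervised MLM further pre-training on unlabeled in-domain text for domain adaptation, followed by supervised fine-tuning for offensive language classification. File names, model size, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: domain-adaptive MLM pre-training, then supervised fine-tuning.
# File names and hyperparameters are illustrative, not the authors' settings.
from datasets import load_dataset
from transformers import (DataCollatorForLanguageModeling, RobertaForMaskedLM,
                          RobertaForSequenceClassification, RobertaTokenizerFast,
                          Trainer, TrainingArguments)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-large")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

# Stage 1: further pre-training with masked language modeling on unlabeled
# in-domain data resembling the classification target dataset.
unlabeled = load_dataset("text", data_files={"train": "indomain_tweets.txt"})
unlabeled = unlabeled.map(tokenize, batched=True, remove_columns=["text"])
mlm_model = RobertaForMaskedLM.from_pretrained("roberta-large")
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="roberta-mlm-adapted", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=unlabeled["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
).train()
mlm_model.save_pretrained("roberta-mlm-adapted")

# Stage 2: supervised fine-tuning on the labeled task data, initialized
# from the domain-adapted checkpoint (the classification head is new).
labeled = load_dataset("csv", data_files={"train": "offense_train.csv"})  # text,label
labeled = labeled.map(tokenize, batched=True)
classifier = RobertaForSequenceClassification.from_pretrained(
    "roberta-mlm-adapted", num_labels=2)
Trainer(
    model=classifier,
    args=TrainingArguments(output_dir="offense-classifier", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=labeled["train"],
).train()
```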
Related papers
- MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection [18.0885324380572]
We propose a pure Transformer-based SED model with masked-reconstruction based pre-training, termed MAT-SED.
Both the encoder and the context network are jointly fine-tuned in a semi-supervised manner.
arXiv Detail & Related papers (2024-08-16T11:33:16Z)
- Structural Self-Supervised Objectives for Transformers [3.018656336329545]
This thesis focuses on improving the pre-training of natural language models using unsupervised raw data.
In the first part, we introduce three alternative pre-training objectives to BERT's Masked Language Modeling (MLM).
In the second part, we propose self-supervised pre-training tasks that align structurally with downstream applications.
arXiv Detail & Related papers (2023-09-15T09:30:45Z)
- Task Residual for Tuning Vision-Language Models [69.22958802711017]
We propose a new efficient tuning approach for vision-language models (VLMs) named Task Residual Tuning (TaskRes).
TaskRes explicitly decouples the prior knowledge of the pre-trained models and new knowledge regarding a target task.
The proposed TaskRes is simple yet effective and significantly outperforms previous methods on 11 benchmark datasets.
arXiv Detail & Related papers (2022-11-18T15:09:03Z)
- Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training [55.43088293183165]
Recent studies show that pre-trained language models (PLMs) like BERT contain matching subnetworks that have similar transfer learning performance as the original PLM.
In this paper, we find that the BERT subnetworks have even more potential than these studies have shown.
We train binary masks over model weights on the pre-training tasks, with the aim of preserving the universal transferability of the subnetwork.
arXiv Detail & Related papers (2022-04-24T08:42:47Z)
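The binary-mask idea in the preceding entry can be illustrated with a generic PyTorch sketch: per-weight scores are learned while the pre-trained weights stay frozen, and a straight-through estimator lets gradients flow through the hard top-k mask. This is an illustration under stated assumptions, not the cited paper's exact procedure.

```python
# Generic sketch of training a binary mask over frozen pre-trained weights
# with a straight-through estimator; not the cited paper's exact method.
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    def __init__(self, linear: nn.Linear, sparsity: float = 0.5):
        super().__init__()
        self.weight = nn.Parameter(linear.weight.data, requires_grad=False)  # frozen
        self.bias = nn.Parameter(linear.bias.data, requires_grad=False)      # frozen
        self.scores = nn.Parameter(torch.randn_like(self.weight) * 0.01)     # learned
        self.sparsity = sparsity

    def forward(self, x):
        # Keep the top-(1 - sparsity) fraction of weights ranked by score.
        k = int(self.scores.numel() * (1 - self.sparsity))
        threshold = torch.topk(self.scores.flatten(), k).values.min()
        hard_mask = (self.scores >= threshold).float()
        # Straight-through: forward uses the hard mask, gradients reach the scores.
        mask = hard_mask + self.scores - self.scores.detach()
        return nn.functional.linear(x, self.weight * mask, self.bias)

# Usage: wrap a pre-trained layer and train only the mask scores.
layer = MaskedLinear(nn.Linear(768, 768))
optimizer = torch.optim.Adam([layer.scores], lr=1e-3)
out = layer(torch.randn(4, 768))
out.sum().backward()   # gradients flow to layer.scores only
optimizer.step()
```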
- Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of directly incorporating the pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
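A minimal, generic sketch of the token-level $k$-nearest-neighbor interpolation that $k$NN-MT builds on: a decoder hidden state is looked up in a datastore of (hidden state, target token) pairs, and the retrieval distribution is interpolated with the NMT model's output distribution. The interpolation weight, temperature, and tensor layout below are illustrative assumptions.

```python
# Generic sketch of token-level kNN interpolation as used in kNN-MT;
# the datastore is a plain tensor here, lambda and temperature are illustrative.
import torch

def knn_interpolate(hidden, nmt_probs, keys, values, vocab_size,
                    k=8, temperature=10.0, lam=0.5):
    """hidden: (d,) decoder state; nmt_probs: (V,) model distribution;
    keys: (N, d) datastore states; values: (N,) target token ids (long)."""
    dists = torch.cdist(hidden[None, :], keys)[0]            # (N,) L2 distances
    neg_dists, nn_idx = torch.topk(-dists, k)                # k nearest neighbors
    weights = torch.softmax(neg_dists / temperature, dim=0)  # closer => larger weight
    knn_probs = torch.zeros(vocab_size)
    knn_probs.index_add_(0, values[nn_idx], weights)         # aggregate per token
    return lam * knn_probs + (1 - lam) * nmt_probs           # interpolated distribution
```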
- Robust Transfer Learning with Pretrained Language Models through Adapters [40.45102278979193]
Transfer learning with large pretrained language models like BERT has become a dominating approach for most NLP tasks.
We propose a simple yet effective adapter-based approach to mitigate these issues.
Our experiments demonstrate that such a training scheme leads to improved stability and adversarial robustness in transfer learning to various downstream tasks.
arXiv Detail & Related papers (2021-08-05T02:30:13Z)
- The Lottery Ticket Hypothesis for Pre-trained BERT Networks [137.99328302234338]
In natural language processing (NLP), enormous pre-trained models like BERT have become the standard starting point for training.
In parallel, work on the lottery ticket hypothesis has shown that models for NLP and computer vision contain smaller matching subnetworks capable of training in isolation to full accuracy.
We combine these observations to assess whether such trainable, transferrable subnetworks exist in pre-trained BERT models.
arXiv Detail & Related papers (2020-07-23T19:35:39Z)
- A Simple Baseline to Semi-Supervised Domain Adaptation for Machine Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
arXiv Detail & Related papers (2020-01-22T16:42:06Z)
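The three training objectives in the last entry can be sketched as one alternating update loop; `model` and its methods `lm_loss`, `translate`, and `translation_loss`, as well as the batch iterators, are hypothetical placeholders used only to show how the objectives interleave, not an interface from the cited paper.

```python
# Sketch of iterating language modeling, back-translation, and supervised
# translation in one loop; `model.lm_loss`, `model.translate`, and
# `model.translation_loss` are hypothetical placeholder methods.
import torch

def train_epoch(model, optimizer, mono_tgt_batches, parallel_batches):
    model.train()
    for mono_tgt, (src, tgt) in zip(mono_tgt_batches, parallel_batches):
        optimizer.zero_grad()
        # 1) Language modeling on in-domain monolingual target-side text.
        loss_lm = model.lm_loss(mono_tgt)
        # 2) Back-translation: generate synthetic source sentences for the
        #    monolingual target text, then train on the synthetic pair.
        with torch.no_grad():
            synthetic_src = model.translate(mono_tgt, direction="tgt2src")
        loss_bt = model.translation_loss(synthetic_src, mono_tgt)
        # 3) Supervised translation on the available parallel data.
        loss_sup = model.translation_loss(src, tgt)
        (loss_lm + loss_bt + loss_sup).backward()
        optimizer.step()
```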