XD at SemEval-2020 Task 12: Ensemble Approach to Offensive Language
Identification in Social Media Using Transformer Encoders
- URL: http://arxiv.org/abs/2007.10945v1
- Date: Tue, 21 Jul 2020 17:03:00 GMT
- Authors: Xiangjue Dong and Jinho D. Choi
- Abstract summary: This paper presents six document classification models using the latest transformer encoders and a high-performing ensemble model for a task of offensive language identification in social media.
Our analysis shows that although the ensemble model significantly improves the accuracy on the development set, the improvement is not as evident on the test set.
- Score: 17.14709845342071
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents six document classification models using the latest
transformer encoders and a high-performing ensemble model for a task of
offensive language identification in social media. For the individual models,
deep transformer layers are applied to perform multi-head attention. For the
ensemble model, the utterance representations taken from those individual
models are concatenated and fed into a linear decoder to make the final
decisions. Our ensemble model outperforms the individual models, showing up to an
8.6% improvement over them on the development set. On the test
set, it achieves a macro-F1 of 90.9% and is one of the high-performing
systems among the 85 participants in sub-task A of this shared task. Our
analysis shows that although the ensemble model significantly improves the
accuracy on the development set, the improvement is not as evident on the test
set.
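As a minimal sketch of the setup described above (not the authors' released code), the snippet below concatenates the utterance ([CLS]) representations of several transformer encoders and feeds them to a linear decoder; the checkpoint names and the binary label space are placeholder assumptions.
```python
# Illustrative sketch only: concatenate [CLS] representations from several
# transformer encoders and classify with a single linear decoder.
import torch
import torch.nn as nn
from transformers import AutoModel

ENCODER_NAMES = ["bert-base-cased", "roberta-base", "xlm-roberta-base"]  # placeholders

class ConcatEnsemble(nn.Module):
    def __init__(self, encoder_names=ENCODER_NAMES, num_labels=2):
        super().__init__()
        self.encoders = nn.ModuleList(AutoModel.from_pretrained(n) for n in encoder_names)
        concat_size = sum(e.config.hidden_size for e in self.encoders)
        self.decoder = nn.Linear(concat_size, num_labels)  # linear decoder over concatenated features

    def forward(self, batches):
        # `batches`: one tokenized batch per encoder, since each encoder keeps its own tokenizer
        reps = [enc(**b).last_hidden_state[:, 0] for enc, b in zip(self.encoders, batches)]
        return self.decoder(torch.cat(reps, dim=-1))
```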
Related papers
- A Collaborative Ensemble Framework for CTR Prediction [73.59868761656317]
We propose a novel framework, Collaborative Ensemble Training Network (CETNet), to leverage multiple distinct models.
Unlike naive model scaling, our approach emphasizes diversity and collaboration through collaborative learning.
We validate our framework on three public datasets and a large-scale industrial dataset from Meta.
arXiv Detail & Related papers (2024-11-20T20:38:56Z)
- Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens [53.99177152562075]
Scaling up autoregressive models in vision has not proven as beneficial as in large language models.
We focus on two critical factors: whether models use discrete or continuous tokens, and whether tokens are generated in a random or fixed order using BERT- or GPT-like transformer architectures.
Our results show that while all models scale effectively in terms of validation loss, their evaluation performance -- measured by FID, GenEval score, and visual quality -- follows different trends.
arXiv Detail & Related papers (2024-10-17T17:59:59Z)
- FusionBench: A Comprehensive Benchmark of Deep Model Fusion [78.80920533793595]
Deep model fusion is a technique that unifies the predictions or parameters of several deep neural networks into a single model.
FusionBench is the first comprehensive benchmark dedicated to deep model fusion.
arXiv Detail & Related papers (2024-06-05T13:54:28Z)
- Domain Adaptation of Transformer-Based Models using Unlabeled Data for Relevance and Polarity Classification of German Customer Feedback [1.2999413717930817]
This work explores how efficient transformer-based models are when working with a German customer feedback dataset.
The experimental results show that transformer-based models achieve significant improvements over a fastText baseline.
arXiv Detail & Related papers (2022-12-12T08:32:28Z)
- Model ensemble instead of prompt fusion: a sample-specific knowledge transfer method for few-shot prompt tuning [85.55727213502402]
We focus on improving the few-shot performance of prompt tuning by transferring knowledge from soft prompts of source tasks.
We propose Sample-specific Ensemble of Source Models (SESoM).
SESoM learns to adjust the contribution of each source model for each target sample separately when ensembling source model outputs.
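For intuition only, here is a toy sketch of that sample-specific weighting idea (not the SESoM implementation): a small gating module scores each source model per input and mixes their output logits accordingly.
```python
import torch
import torch.nn as nn

class SampleSpecificEnsemble(nn.Module):
    """Toy per-sample weighting of frozen source-model outputs (illustrative only)."""
    def __init__(self, feature_dim, num_sources):
        super().__init__()
        self.gate = nn.Linear(feature_dim, num_sources)  # one score per source model

    def forward(self, sample_features, source_logits):
        # sample_features: (batch, feature_dim); source_logits: (batch, num_sources, num_labels)
        weights = torch.softmax(self.gate(sample_features), dim=-1)  # per-sample mixture weights
        return torch.einsum("bs,bsl->bl", weights, source_logits)    # weighted sum of logits
```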
arXiv Detail & Related papers (2022-10-23T01:33:16Z)
- Multi-Source Transformer Architectures for Audiovisual Scene Classification [14.160670979300628]
The systems we submitted for subtask 1B of the DCASE 2021 challenge, regarding audiovisual scene classification, are described in detail.
They are essentially multi-source transformers employing a combination of auditory and visual features to make predictions.
arXiv Detail & Related papers (2022-10-18T23:42:42Z)
- A Study on Transformer Configuration and Training Objective [33.7272660870026]
We propose Bamboo, the idea of using deeper and narrower transformer configurations for masked autoencoder training.
On ImageNet, with such a simple change in configuration, the re-designed model achieves 87.1% top-1 accuracy.
On language tasks, the re-designed model outperforms BERT with the default setting by 1.1 points on average.
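Purely as an illustration of "deeper and narrower" (the summary above does not state the exact depth and width the paper uses), a BERT-style configuration could be reshaped like this:
```python
from transformers import BertConfig

# Default BERT-base shape vs. a deeper-and-narrower variant; numbers are illustrative only.
default_cfg = BertConfig(hidden_size=768, num_hidden_layers=12,
                         num_attention_heads=12, intermediate_size=3072)
deeper_narrower_cfg = BertConfig(hidden_size=512, num_hidden_layers=24,
                                 num_attention_heads=8, intermediate_size=2048)
```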
arXiv Detail & Related papers (2022-05-21T05:17:11Z)
- CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing [83.63107444454938]
We propose a consistency-regularized ensemble learning approach based on perturbed models, named CAMERO.
Specifically, we share the weights of bottom layers across all models and apply different perturbations to the hidden representations for different models, which can effectively promote the model diversity.
Our experiments using large language models demonstrate that CAMERO significantly improves the generalization performance of the ensemble model.
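A rough sketch of that recipe, assuming a generic PyTorch encoder and dropout as the perturbation (the actual CAMERO perturbations and losses may differ):
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerturbedHeads(nn.Module):
    """Shared bottom encoder; each 'model' sees a differently perturbed hidden state."""
    def __init__(self, shared_encoder, hidden_size, num_labels, num_models=4):
        super().__init__()
        self.shared = shared_encoder  # bottom layers with shared weights
        self.heads = nn.ModuleList(nn.Linear(hidden_size, num_labels) for _ in range(num_models))

    def forward(self, x):
        h = self.shared(x)
        # a different dropout mask per head acts as the perturbation
        return [head(F.dropout(h, p=0.1, training=self.training)) for head in self.heads]

def consistency_loss(logits_list):
    # pull every perturbed model toward the averaged ensemble prediction
    mean_p = torch.stack([F.softmax(l, dim=-1) for l in logits_list]).mean(0).detach()
    return sum(F.kl_div(F.log_softmax(l, dim=-1), mean_p, reduction="batchmean")
               for l in logits_list) / len(logits_list)
```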
arXiv Detail & Related papers (2022-04-13T19:54:51Z)
- Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time [69.7693300927423]
We show that averaging the weights of multiple models fine-tuned with different hyperparameter configurations improves accuracy and robustness.
We show that the model soup approach extends to multiple image classification and natural language processing tasks.
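A minimal sketch of the uniform-soup idea, assuming all checkpoints share one architecture and floating-point parameters (file names are hypothetical):
```python
import torch

def uniform_soup(state_dicts):
    """Average several fine-tuned checkpoints of the same architecture, key by key."""
    return {key: torch.stack([sd[key] for sd in state_dicts]).mean(dim=0)
            for key in state_dicts[0]}

# Hypothetical usage:
# soup = uniform_soup([torch.load(p, map_location="cpu") for p in ["run1.pt", "run2.pt", "run3.pt"]])
# model.load_state_dict(soup)
```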
arXiv Detail & Related papers (2022-03-10T17:03:49Z)
- FiSSA at SemEval-2020 Task 9: Fine-tuned For Feelings [2.362412515574206]
In this paper, we present our approach for sentiment classification on Spanish-English code-mixed social media data.
We explore both monolingual and multilingual models with the standard fine-tuning method.
Although two-step fine-tuning improves sentiment classification performance over the base model, the large multilingual XLM-RoBERTa model achieves the best weighted F1-score.
arXiv Detail & Related papers (2020-07-24T14:48:27Z)
- Gestalt: a Stacking Ensemble for SQuAD2.0 [0.0]
We propose a deep-learning system that finds, or indicates the lack of, a correct answer to a question in a context paragraph.
Our goal is to learn an ensemble of heterogeneous SQuAD2.0 models that outperforms the best individual model in the ensemble.
arXiv Detail & Related papers (2020-04-02T08:09:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.