Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News
Detection in English
- URL: http://arxiv.org/abs/2101.02359v1
- Date: Thu, 7 Jan 2021 04:01:13 GMT
- Title: Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News
Detection in English
- Authors: Xiangyang Li, Yu Xia, Xiang Long, Zheng Li, Sujian Li
- Abstract summary: We describe our system for the AAAI 2021 shared task of COVID-19 Fake News Detection in English.
We proposed an ensemble method of different pre-trained language models including BERT, RoBERTa, ERNIE, etc.
We also conduct an extensive analysis of the samples that are not correctly classified.
- Score: 30.61407811064534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we describe our system for the AAAI 2021 shared task of
COVID-19 Fake News Detection in English, where we achieved the 3rd position
with a weighted F1 score of 0.9859 on the test set. Specifically, we proposed
an ensemble method of different pre-trained language models such as BERT,
RoBERTa, ERNIE, etc., with various training strategies including
warm-up, learning rate scheduling, and k-fold cross-validation. We also conduct an
extensive analysis of the samples that are not correctly classified. The code
is available
at: https://github.com/archersama/3rd-solution-COVID19-Fake-News-Detection-in-English.
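The ensemble strategy described above can be sketched in miniature: fine-tuned models each produce a probability that a post is fake, the training set is split into k folds, and the members' probabilities are averaged (soft voting). This is a minimal illustrative sketch, not the authors' released code; the model names and the toy probability values are assumptions for demonstration.

```python
# Minimal sketch of k-fold splitting plus soft-voting over ensemble
# members (e.g. fine-tuned BERT, RoBERTa, ERNIE). Illustrative only.
from statistics import mean

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds (last fold takes the remainder)."""
    fold_size, folds = n // k, []
    for i in range(k):
        start = i * fold_size
        end = start + fold_size if i < k - 1 else n
        folds.append(list(range(start, end)))
    return folds

def soft_vote(prob_lists):
    """Average per-sample 'fake' probabilities across ensemble members."""
    return [mean(ps) for ps in zip(*prob_lists)]

# Three hypothetical members each score the same two test samples.
member_probs = [
    [0.9, 0.2],   # e.g. BERT (toy values)
    [0.8, 0.4],   # e.g. RoBERTa
    [0.7, 0.3],   # e.g. ERNIE
]
ensemble = soft_vote(member_probs)
labels = ["fake" if p >= 0.5 else "real" for p in ensemble]
```

In practice each member would itself be averaged over its k cross-validation folds before entering the vote.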
Related papers
- Unify word-level and span-level tasks: NJUNLP's Participation for the
WMT2023 Quality Estimation Shared Task [59.46906545506715]
We introduce the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task.
Our team submitted predictions for the English-German language pair on both sub-tasks.
Our models achieved the best results in English-German for both word-level and fine-grained error span detection sub-tasks.
arXiv Detail & Related papers (2023-09-23T01:52:14Z)
- Strategies for improving low resource speech to text translation relying on pre-trained ASR models [59.90106959717875]
This paper presents techniques and findings for improving the performance of low-resource speech-to-text translation (ST).
We conducted experiments on both simulated and real low-resource setups, on the language pairs English-Portuguese and Tamasheq-French, respectively.
arXiv Detail & Related papers (2023-05-31T21:58:07Z) - Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z) - BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Based on the Transformer, we apply several effective variants.
Our systems achieve 0.810 and 0.946 COMET scores.
arXiv Detail & Related papers (2022-11-28T02:35:04Z) - Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake
News Detection [7.29381091750894]
We propose a novel transformer-based language model fine-tuning approach for fake news detection.
First, the token vocabulary of each individual model is expanded to capture the semantics of domain-specific phrases.
Last, the predictive features extracted by the universal language model RoBERTa and the domain-specific model CT-BERT are fused by a multi-layer perceptron to integrate fine-grained and high-level specific representations.
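The fusion step above can be sketched as follows: feature vectors from a general-domain encoder and a domain-specific one are concatenated and passed through a small multi-layer perceptron that outputs a fake/real probability. This is a toy sketch under stated assumptions; the dimensions, weights, and function names are hypothetical, not the paper's implementation.

```python
# Toy sketch: fuse two encoder feature vectors (e.g. from RoBERTa and
# CT-BERT) with a one-hidden-layer MLP. Weights are random toy values.
import math
import random

def mlp_fuse(feat_a, feat_b, w_hidden, w_out):
    """Concatenate two feature vectors, apply ReLU hidden layer, then sigmoid."""
    x = feat_a + feat_b  # concatenation of the two feature vectors
    hidden = [max(0.0, sum(wi * xi for wi, xi in zip(row, x)))  # ReLU
              for row in w_hidden]
    logit = sum(wo * h for wo, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))  # P(fake)

random.seed(0)
dim_a, dim_b, dim_h = 4, 4, 3  # toy feature and hidden sizes
w_hidden = [[random.uniform(-1, 1) for _ in range(dim_a + dim_b)]
            for _ in range(dim_h)]
w_out = [random.uniform(-1, 1) for _ in range(dim_h)]

p_fake = mlp_fuse([0.1] * dim_a, [0.2] * dim_b, w_hidden, w_out)
```

In the paper this fusion network would be trained jointly with (or on top of) the two fine-tuned encoders rather than using fixed random weights.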
arXiv Detail & Related papers (2021-01-14T09:05:42Z)
- LaDiff ULMFiT: A Layer Differentiated training approach for ULMFiT [0.0]
We propose a Layer Differentiated training procedure for training a pre-trained ULMFiT arXiv:1801.06146 model.
We used special tokens to annotate specific parts of the tweets to improve language understanding and gain insights on the model.
The proposed approach ranked 61st out of 164 in the sub-task "COVID19 Fake News Detection in English".
arXiv Detail & Related papers (2021-01-13T09:52:04Z)
- Constraint 2021: Machine Learning Models for COVID-19 Fake News Detection Shared Task [0.7614628596146599]
We address the challenge of classifying COVID-19 related social media posts as either fake or real.
In our system, we address this challenge by applying classical machine learning algorithms together with several linguistic features.
We find our best performing system to be based on a linear SVM, which obtains a weighted average F1 score of 95.19% on test data.
arXiv Detail & Related papers (2021-01-11T05:57:32Z)
- Facebook AI's WMT20 News Translation Task Submission [69.92594751788403]
This paper describes Facebook AI's submission to WMT20 shared news translation task.
We focus on the low resource setting and participate in two language pairs, Tamil -> English and Inuktitut -> English.
We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target news domain.
arXiv Detail & Related papers (2020-11-16T21:49:00Z)
- Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present a new method for learning multilingual encoders, AMBER (Aligned Multilingual Bi-directional EncodeR).
AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities.
Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
arXiv Detail & Related papers (2020-10-15T18:34:13Z)
- Phonemer at WNUT-2020 Task 2: Sequence Classification Using COVID Twitter BERT and Bagging Ensemble Technique based on Plurality Voting [0.0]
We develop a system that automatically identifies whether an English Tweet related to the novel coronavirus (COVID-19) is informative or not.
Our final approach achieved an F1-score of 0.9037 and we were ranked sixth overall with F1-score as the evaluation criterion.
arXiv Detail & Related papers (2020-10-01T10:54:54Z)
- LynyrdSkynyrd at WNUT-2020 Task 2: Semi-Supervised Learning for Identification of Informative COVID-19 English Tweets [4.361526134899725]
We describe our system for WNUT-2020 shared task on the identification of informative COVID-19 English tweets.
Our system is an ensemble of various machine learning methods, leveraging both traditional feature-based classifiers as well as recent advances in pre-trained language models.
Our best performing model achieves an F1-score of 0.9179 on the provided validation set and 0.8805 on the blind test-set.
arXiv Detail & Related papers (2020-09-08T16:29:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.