Not-NUTs at W-NUT 2020 Task 2: A BERT-based System in Identifying
Informative COVID-19 English Tweets
- URL: http://arxiv.org/abs/2009.06372v1
- Date: Mon, 14 Sep 2020 15:49:16 GMT
- Title: Not-NUTs at W-NUT 2020 Task 2: A BERT-based System in Identifying
Informative COVID-19 English Tweets
- Authors: Thai Quoc Hoang and Phuong Thu Vu
- Abstract summary: We propose a model that, given an English tweet, automatically identifies whether that tweet bears informative content regarding COVID-19 or not.
We achieved competitive results, falling short of the top-performing teams by roughly 1% in F1 score on the informative class.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As of 2020, with the COVID-19 pandemic full-blown on a global scale,
people's need for access to legitimate information regarding COVID-19 is more
urgent than ever, especially via online media, where an abundance of irrelevant
information overshadows the more informative content. In response, we propose a
model that, given an English tweet, automatically identifies whether that tweet
bears informative content regarding COVID-19 or not. By ensembling different
BERTweet model configurations, we achieve competitive results that fall short
of the top-performing teams' by roughly 1% in F1 score on the informative
class. In the post-competition period, we also experimented with various other
approaches that could potentially boost generalization to a new dataset.
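The abstract does not say how the different BERTweet configurations are combined; a common choice for such ensembles is per-tweet majority voting over the individual models' predicted labels, sketched below (all names are hypothetical, not the authors' code):

```python
from collections import Counter

def majority_vote(model_predictions):
    """Combine per-model label lists by per-example majority vote.

    model_predictions: one list of predicted labels per model,
    all of the same length (one label per tweet).
    """
    n_tweets = len(model_predictions[0])
    final = []
    for i in range(n_tweets):
        # Tally each model's vote for tweet i and keep the most common label.
        votes = Counter(preds[i] for preds in model_predictions)
        final.append(votes.most_common(1)[0][0])
    return final

# Three hypothetical BERTweet configurations voting on two tweets;
# each tweet's majority label (2 votes to 1) wins.
print(majority_vote([
    ["INFORMATIVE", "UNINFORMATIVE"],
    ["INFORMATIVE", "INFORMATIVE"],
    ["UNINFORMATIVE", "INFORMATIVE"],
]))
```

Soft-voting over the models' class probabilities is an equally common alternative when the configurations differ mainly in random seed or fine-tuning setup.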
Related papers
- BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Based on the Transformer, we apply several effective variants.
Our systems achieve 0.810 and 0.946 COMET scores.
arXiv Detail & Related papers (2022-11-28T02:35:04Z) - Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2021 [55.41644538483948]
The goal of the shared task is to motivate the community to come up with efficient methods for solving this vital problem.
The training set contains 1300 annotated news articles (750 real and 550 fake), while the testing set contains 300 news articles (200 real and 100 fake).
The best performing system obtained an F1-macro score of 0.679, which is lower than the past year's best result of 0.907 F1-macro.
arXiv Detail & Related papers (2022-07-11T18:58:36Z) - Cross-lingual COVID-19 Fake News Detection [54.125563009333995]
We make the first attempt to detect COVID-19 misinformation in a low-resource language (Chinese) using only fact-checked news in a high-resource language (English).
We propose a deep learning framework named CrossFake to jointly encode the cross-lingual news body texts and capture the news content.
Empirical results on our dataset demonstrate the effectiveness of CrossFake under the cross-lingual setting.
arXiv Detail & Related papers (2021-10-13T04:44:02Z) - Changes in European Solidarity Before and During COVID-19: Evidence from
a Large Crowd- and Expert-Annotated Twitter Dataset [77.27709662210363]
We introduce the well-established social scientific concept of social solidarity and its contestation, anti-solidarity, as a new problem setting to supervised machine learning in NLP.
We annotate 2.3k English and German tweets for (anti-)solidarity expressions, utilizing multiple human annotators and two annotation approaches (experts vs. crowds).
Our results show that solidarity became increasingly salient and contested during the COVID-19 crisis.
arXiv Detail & Related papers (2021-08-02T17:03:12Z) - NIT COVID-19 at WNUT-2020 Task 2: Deep Learning Model RoBERTa for
Identify Informative COVID-19 English Tweets [0.0]
This paper presents the model submitted by the NIT_COVID-19 team for identifying informative COVID-19 English tweets at WNUT-2020 Task 2.
The performance achieved by the proposed model on the WNUT-2020 Task 2 shared task is 89.14% in F1-score.
arXiv Detail & Related papers (2020-11-11T05:20:39Z) - LynyrdSkynyrd at WNUT-2020 Task 2: Semi-Supervised Learning for
Identification of Informative COVID-19 English Tweets [4.361526134899725]
We describe our system for WNUT-2020 shared task on the identification of informative COVID-19 English tweets.
Our system is an ensemble of various machine learning methods, leveraging both traditional feature-based classifiers as well as recent advances in pre-trained language models.
Our best performing model achieves an F1-score of 0.9179 on the provided validation set and 0.8805 on the blind test-set.
arXiv Detail & Related papers (2020-09-08T16:29:25Z) - UIT-HSE at WNUT-2020 Task 2: Exploiting CT-BERT for Identifying COVID-19
Information on the Twitter Social Network [2.7528170226206443]
In this paper, we present our results at the W-NUT 2020 Shared Task 2: Identification of Informative COVID-19 English Tweets.
We propose our simple but effective approach using the transformer-based models based on COVID-Twitter-BERT (CT-BERT) with different fine-tuning techniques.
As a result, we achieve the F1-Score of 90.94% with the third place on the leaderboard of this task which attracted 56 submitted teams in total.
arXiv Detail & Related papers (2020-09-07T08:20:31Z) - BANANA at WNUT-2020 Task 2: Identifying COVID-19 Information on Twitter
by Combining Deep Learning and Transfer Learning Models [0.0]
This paper describes our prediction system for WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets.
The dataset for this task contains 10,000 English tweets labeled by humans.
Experimental results show that our system achieves an F1-score of 88.81% for the INFORMATIVE label on the test set.
arXiv Detail & Related papers (2020-09-06T08:24:55Z) - TATL at W-NUT 2020 Task 2: A Transformer-based Baseline System for
Identification of Informative COVID-19 English Tweets [1.4315501760755605]
We present our participation in the W-NUT 2020 Shared Task 2: Identification of Informative COVID-19 English Tweets.
Inspired by the recent advances in pretrained Transformer language models, we propose a simple yet effective baseline for the task.
Despite its simplicity, our proposed approach shows very competitive results in the leaderboard.
arXiv Detail & Related papers (2020-08-28T21:27:42Z) - A System for Worldwide COVID-19 Information Aggregation [92.60866520230803]
We build a system for worldwide COVID-19 information aggregation containing reliable articles from 10 regions in 7 languages sorted by topics.
A neural machine translation module translates articles in other languages into Japanese and English.
A BERT-based topic classifier trained on our article-topic pair dataset helps users efficiently find the information they are interested in.
arXiv Detail & Related papers (2020-07-28T01:33:54Z) - Misinformation Has High Perplexity [55.47422012881148]
We propose to leverage the perplexity to debunk false claims in an unsupervised manner.
First, we extract reliable evidence from scientific and news sources according to sentence similarity to the claims.
Second, we prime a language model with the extracted evidence and finally evaluate the correctness of given claims based on the perplexity scores at debunking time.
arXiv Detail & Related papers (2020-06-08T15:13:44Z)
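The "Misinformation Has High Perplexity" entry above scores claims by language-model perplexity; as a reminder of the quantity involved (not that paper's code), the perplexity of a text whose tokens receive model probabilities p_1..p_N is exp(-(1/N) Σ log p_i):

```python
import math

def perplexity(token_probs):
    """Perplexity from per-token probabilities assigned by a language model.

    PPL = exp(-(1/N) * sum(log p_i)); higher values mean the model finds
    the text less likely -- the signal used there to flag misinformation.
    """
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

# A model that is uniformly uncertain over 4 choices per token
# yields a perplexity of 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

In practice the per-token probabilities come from a causal language model primed on the extracted evidence, so the same claim can receive different perplexities under different evidence.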
This list is automatically generated from the titles and abstracts of the papers in this site.