NIT COVID-19 at WNUT-2020 Task 2: Deep Learning Model RoBERTa for
Identify Informative COVID-19 English Tweets
- URL: http://arxiv.org/abs/2011.05551v1
- Date: Wed, 11 Nov 2020 05:20:39 GMT
- Title: NIT COVID-19 at WNUT-2020 Task 2: Deep Learning Model RoBERTa for
Identify Informative COVID-19 English Tweets
- Authors: Jagadeesh M S, Alphonse P J A
- Abstract summary: This paper presents the model submitted by the NIT_COVID-19 team for identifying informative COVID-19 English tweets at WNUT-2020 Task 2.
The proposed model achieves an F1-score of 89.14% on the shared task.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents the model submitted by the NIT_COVID-19 team for
identifying informative COVID-19 English tweets at WNUT-2020 Task 2. This shared
task addresses the problem of automatically identifying whether an English
tweet about the novel coronavirus is informative or not. Informative tweets
provide information about recovered, confirmed, suspected, and death cases, as
well as the location or travel history of the cases. The proposed approach
combines pre-processing techniques with a pre-trained RoBERTa model fine-tuned
with suitable hyperparameters for English coronavirus tweet classification; a
hedged sketch of this pipeline follows the abstract. The proposed model
achieves an F1-score of 89.14% on the shared task.
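
To make the described pipeline concrete, below is a minimal sketch of the pre-processing and RoBERTa fine-tuning stages, assuming the Hugging Face transformers and datasets libraries. The abstract does not specify the actual pre-processing steps or hyperparameters, so the masking rules, epoch count, batch size, and learning rate below are illustrative assumptions, not the authors' settings.

import re

from datasets import Dataset
from transformers import (
    RobertaForSequenceClassification,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

def preprocess_tweet(text: str) -> str:
    """Normalize tweet artifacts before tokenization (assumed steps)."""
    text = re.sub(r"https?://\S+", "HTTPURL", text)  # mask URLs
    text = re.sub(r"@\w+", "@USER", text)            # mask user mentions
    return text.strip()

# Toy data; the shared task labels tweets INFORMATIVE (1) or UNINFORMATIVE (0).
train = Dataset.from_dict({
    "text": [preprocess_tweet("3 new confirmed cases reported in the region"),
             preprocess_tweet("Stay safe everyone!")],
    "label": [1, 0],
})

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train = train.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="roberta-covid-tweets",
    num_train_epochs=3,               # assumed; the paper only says "suitable"
    per_device_train_batch_size=16,   # assumed
    learning_rate=2e-5,               # assumed
)

Trainer(model=model, args=args, train_dataset=train).train()

At inference time, the argmax over the fine-tuned model's two logits gives the informative/uninformative label; F1 on the informative class is the shared-task metric.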
Related papers
- ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents [49.00494558898933]
This paper describes our participation in Task 3 and Task 5 of the #SMM4H (Social Media Mining for Health) 2024 Workshop.
Task 3 is a multi-class classification task centered on tweets discussing the impact of outdoor environments on symptoms of social anxiety.
Task 5 involves a binary classification task focusing on tweets reporting medical disorders in children.
We applied transfer learning from pre-trained encoder-decoder models such as BART-base and T5-small to identify the labels of a set of given tweets.
arXiv Detail & Related papers (2024-04-30T17:06:20Z) - BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Based on the Transformer, we apply several effective variants.
Our systems achieve COMET scores of 0.810 and 0.946.
arXiv Detail & Related papers (2022-11-28T02:35:04Z) - Overview of Abusive and Threatening Language Detection in Urdu at FIRE
2021 [50.591267188664666]
We present two shared tasks of abusive and threatening language detection for the Urdu language.
We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening.
For both subtasks, the m-BERT-based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z) - Annotation Curricula to Implicitly Train Non-Expert Annotators [56.67768938052715]
Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain.
This can be overwhelming in the beginning, mentally taxing, and induce errors into the resulting annotations.
We propose annotation curricula, a novel approach to implicitly train annotators.
arXiv Detail & Related papers (2021-06-04T09:48:28Z) - Not-NUTs at W-NUT 2020 Task 2: A BERT-based System in Identifying
Informative COVID-19 English Tweets [0.0]
We propose a model that, given an English tweet, automatically identifies whether the tweet bears informative content regarding COVID-19.
We have achieved competitive results, only about 1% shy of the top-performing teams in terms of F1-score on the informative class.
arXiv Detail & Related papers (2020-09-14T15:49:16Z) - CIA_NITT at WNUT-2020 Task 2: Classification of COVID-19 Tweets Using
Pre-trained Language Models [0.0]
We treat this as a binary text classification problem and experiment with pre-trained language models.
Our first model, based on CT-BERT, achieves an F1-score of 88.7%, and our second model, an ensemble of CT-BERT, RoBERTa, and SVM, achieves an F1-score of 88.52% (see the ensemble sketch after this list).
arXiv Detail & Related papers (2020-09-12T12:59:54Z) - LynyrdSkynyrd at WNUT-2020 Task 2: Semi-Supervised Learning for
Identification of Informative COVID-19 English Tweets [4.361526134899725]
We describe our system for WNUT-2020 shared task on the identification of informative COVID-19 English tweets.
Our system is an ensemble of various machine learning methods, leveraging both traditional feature-based classifiers as well as recent advances in pre-trained language models.
Our best performing model achieves an F1-score of 0.9179 on the provided validation set and 0.8805 on the blind test-set.
arXiv Detail & Related papers (2020-09-08T16:29:25Z) - UIT-HSE at WNUT-2020 Task 2: Exploiting CT-BERT for Identifying COVID-19
Information on the Twitter Social Network [2.7528170226206443]
In this paper, we present our results at the W-NUT 2020 Shared Task 2: Identification of Informative COVID-19 English Tweets.
We propose a simple but effective approach using transformer-based models built on COVID-Twitter-BERT (CT-BERT) with different fine-tuning techniques.
As a result, we achieve an F1-score of 90.94%, taking third place on the leaderboard of this task, which attracted 56 submitted teams in total.
arXiv Detail & Related papers (2020-09-07T08:20:31Z) - BANANA at WNUT-2020 Task 2: Identifying COVID-19 Information on Twitter
by Combining Deep Learning and Transfer Learning Models [0.0]
This paper describes our prediction system for WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets.
The dataset for this task contains 10,000 English tweets labeled by humans.
Experimental results indicate that our system achieves an F1-score of 88.81% for the INFORMATIVE label on the test set.
arXiv Detail & Related papers (2020-09-06T08:24:55Z) - CO-Search: COVID-19 Information Retrieval with Semantic Search, Question
Answering, and Abstractive Summarization [53.67205506042232]
CO-Search is a retriever-ranker semantic search engine designed to handle complex queries over the COVID-19 literature.
To account for the domain-specific and relatively limited dataset, we generate a bipartite graph of document paragraphs and citations.
We evaluate our system on the data of the TREC-COVID information retrieval challenge.
arXiv Detail & Related papers (2020-06-17T01:32:48Z) - Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for
Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves a 91.51% F1 score in English Sub-task A, comparable to the first-place result.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)
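
Several of the entries above (CIA_NITT, LynyrdSkynyrd) combine pre-trained language models with traditional feature-based classifiers. As a minimal sketch of one such combination, the snippet below soft-votes a TF-IDF + SVM branch with a stubbed transformer branch using scikit-learn; the abstracts do not state the actual combination schemes, so simple probability averaging is an illustrative assumption.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# Toy training data: 1 = INFORMATIVE, 0 = UNINFORMATIVE.
texts = [
    "Two new confirmed cases reported in the city",
    "Patient travel history traced to three provinces",
    "Death toll rises to 45, officials confirm",
    "Good morning everyone, stay positive!",
    "I miss going to concerts so much",
    "Check out my new recipe blog",
]
labels = [1, 1, 1, 0, 0, 0]

# Feature-based branch: TF-IDF features with a probabilistic SVM.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
svm = SVC(probability=True).fit(vectorizer.fit_transform(texts), labels)

def transformer_proba(batch):
    """Stub for the fine-tuned transformer's class probabilities
    (e.g. softmax outputs of the RoBERTa model sketched earlier)."""
    return np.array([[0.2, 0.8], [0.9, 0.1]])[: len(batch)]

def ensemble_predict(batch):
    # Soft voting: average the two branches' class probabilities.
    p_svm = svm.predict_proba(vectorizer.transform(batch))
    p_tx = transformer_proba(batch)
    return ((p_svm + p_tx) / 2).argmax(axis=1)

print(ensemble_predict(["3 deaths confirmed in the region"]))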
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.