BANANA at WNUT-2020 Task 2: Identifying COVID-19 Information on Twitter
by Combining Deep Learning and Transfer Learning Models
- URL: http://arxiv.org/abs/2009.02671v2
- Date: Thu, 1 Apr 2021 06:21:07 GMT
- Title: BANANA at WNUT-2020 Task 2: Identifying COVID-19 Information on Twitter
by Combining Deep Learning and Transfer Learning Models
- Authors: Tin Van Huynh, Luan Thanh Nguyen and Son T. Luu
- Abstract summary: This paper describes our prediction system for WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets.
The dataset for this task contains 10,000 human-labeled English tweets.
The experimental results show that our system achieves an F1 score of 88.81% for the INFORMATIVE label on the test set.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The outbreak of the COVID-19 virus has had a significant impact on the health of
people all over the world. It is therefore essential to provide everyone with
constant and accurate information about the disease. This paper
describes our prediction system for WNUT-2020 Task 2: Identification of
Informative COVID-19 English Tweets. The dataset for this task contains
10,000 human-labeled English tweets. The final prediction comes from an ensemble of our three
transformer and deep learning models. The
experimental results show that our system achieves an F1 score of 88.81% for the INFORMATIVE
label on the test set.
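Illustration (not from the paper): the abstract does not say how the three models are combined, so the sketch below assumes simple majority voting over per-tweet labels and then computes F1 for the INFORMATIVE class. The function names, the UNINFORMATIVE label string, and the toy predictions are all hypothetical and only show the arithmetic behind an ensemble prediction and a per-label F1 score.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per tweet by majority vote.

    Assumption: voting is only one possible ensembling strategy; the paper's
    actual combination rule is not stated in the abstract.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def f1_for_label(gold, predicted, label="INFORMATIVE"):
    """F1 score treating `label` as the positive class."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, predicted) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, predicted) if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy predictions from three hypothetical models on four tweets.
model_a = ["INFORMATIVE", "UNINFORMATIVE", "INFORMATIVE", "UNINFORMATIVE"]
model_b = ["INFORMATIVE", "INFORMATIVE", "UNINFORMATIVE", "UNINFORMATIVE"]
model_c = ["UNINFORMATIVE", "INFORMATIVE", "INFORMATIVE", "UNINFORMATIVE"]
gold = ["INFORMATIVE", "UNINFORMATIVE", "INFORMATIVE", "UNINFORMATIVE"]

ensemble = majority_vote([model_a, model_b, model_c])
score = f1_for_label(gold, ensemble)
print(ensemble, round(score, 4))
```

With three voters and two labels a tie cannot occur, which is one reason an odd number of models is convenient for this kind of vote.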
Related papers
- ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents [49.00494558898933]
This paper describes our participation in Task 3 and Task 5 of the #SMM4H (Social Media Mining for Health) 2024 Workshop.
Task 3 is a multi-class classification task centered on tweets discussing the impact of outdoor environments on symptoms of social anxiety.
Task 5 involves a binary classification task focusing on tweets reporting medical disorders in children.
We applied transfer learning from pre-trained encoder-decoder models such as BART-base and T5-small to identify the labels of a set of given tweets.
arXiv Detail & Related papers (2024-04-30T17:06:20Z)
- Human Behavior in the Time of COVID-19: Learning from Big Data [71.26355067309193]
Since March 2020, there have been over 600 million confirmed cases of COVID-19 and more than six million deaths.
The pandemic has impacted and even changed human behavior in almost every aspect.
Researchers have been employing big data techniques such as natural language processing, computer vision, audio signal processing, frequent pattern mining, and machine learning.
arXiv Detail & Related papers (2023-03-23T17:19:26Z)
- Two-Stage Classifier for COVID-19 Misinformation Detection Using BERT: a Study on Indonesian Tweets [0.15229257192293202]
Research on COVID-19 misinformation detection in Indonesia is still scarce.
In this study, we propose a two-stage classifier model using the IndoBERT pre-trained language model for the tweet misinformation detection task.
The experimental results show that the combination of the BERT sequence classifier for relevance prediction and Bi-LSTM for misinformation detection outperformed other machine learning models with an accuracy of 87.02%.
arXiv Detail & Related papers (2022-06-30T15:33:20Z)
- When Accuracy Meets Privacy: Two-Stage Federated Transfer Learning Framework in Classification of Medical Images on Limited Data: A COVID-19 Case Study [77.34726150561087]
The COVID-19 pandemic has spread rapidly and caused a global shortage of medical resources.
CNNs have been widely used and validated for analyzing medical images.
arXiv Detail & Related papers (2022-03-24T02:09:41Z)
- Misleading the Covid-19 vaccination discourse on Twitter: An exploratory study of infodemic around the pandemic [0.45593531937154413]
We collect a moderate-sized representative corpus of tweets (approx. 200,000) pertaining to Covid-19 vaccination over a period of seven months (September 2020 - March 2021).
Following a Transfer Learning approach, we utilize the pre-trained Transformer-based XLNet model to classify tweets as Misleading or Non-Misleading.
We build on this to study and contrast the characteristics of tweets in the corpus that are misleading in nature against non-misleading ones.
Several ML models are employed for prediction, with up to 90% accuracy, and the importance of each feature is explained using SHAP Explainable AI (X
arXiv Detail & Related papers (2021-08-16T17:02:18Z)
- NIT COVID-19 at WNUT-2020 Task 2: Deep Learning Model RoBERTa for Identify Informative COVID-19 English Tweets [0.0]
This paper presents the model submitted by the NIT_COVID-19 team for identifying informative COVID-19 English tweets at WNUT-2020 Task 2.
The proposed model achieves an F1-score of 89.14% on the WNUT-2020 Task 2 shared task.
arXiv Detail & Related papers (2020-11-11T05:20:39Z)
- Not-NUTs at W-NUT 2020 Task 2: A BERT-based System in Identifying Informative COVID-19 English Tweets [0.0]
We propose a model that, given an English tweet, automatically identifies whether that tweet bears informative content regarding COVID-19 or not.
We achieved competitive results, falling short of the top-performing teams by only about 1% in F1 score on the informative class.
arXiv Detail & Related papers (2020-09-14T15:49:16Z)
- LynyrdSkynyrd at WNUT-2020 Task 2: Semi-Supervised Learning for Identification of Informative COVID-19 English Tweets [4.361526134899725]
We describe our system for WNUT-2020 shared task on the identification of informative COVID-19 English tweets.
Our system is an ensemble of various machine learning methods, leveraging both traditional feature-based classifiers as well as recent advances in pre-trained language models.
Our best performing model achieves an F1-score of 0.9179 on the provided validation set and 0.8805 on the blind test-set.
arXiv Detail & Related papers (2020-09-08T16:29:25Z)
- TICO-19: the Translation Initiative for Covid-19 [112.5601530395345]
The Translation Initiative for COvid-19 (TICO-19) has made test and development data available to AI and MT researchers in 35 different languages.
The same data is translated into all of the languages represented, meaning that testing or development can be done for any pairing of languages in the set.
arXiv Detail & Related papers (2020-07-03T16:26:17Z)
- CO-Search: COVID-19 Information Retrieval with Semantic Search, Question Answering, and Abstractive Summarization [53.67205506042232]
CO-Search is a retriever-ranker semantic search engine designed to handle complex queries over the COVID-19 literature.
To account for the domain-specific and relatively limited dataset, we generate a bipartite graph of document paragraphs and citations.
We evaluate our system on the data of the TREC-COVID information retrieval challenge.
arXiv Detail & Related papers (2020-06-17T01:32:48Z)
- Cross-lingual Transfer Learning for COVID-19 Outbreak Alignment [90.12602012910465]
We train on Italy's early COVID-19 outbreak through Twitter and transfer to several other countries.
Our experiments show strong results with up to 0.85 Spearman correlation in cross-country predictions.
arXiv Detail & Related papers (2020-06-05T02:04:25Z)