Constraint 2021: Machine Learning Models for COVID-19 Fake News
Detection Shared Task
- URL: http://arxiv.org/abs/2101.03717v2
- Date: Wed, 13 Jan 2021 00:06:45 GMT
- Title: Constraint 2021: Machine Learning Models for COVID-19 Fake News
Detection Shared Task
- Authors: Thomas Felber
- Abstract summary: We address the challenge of classifying COVID-19 related social media posts as either fake or real.
In our system, we address this challenge by applying classical machine learning algorithms together with several linguistic features.
We find our best performing system to be based on a linear SVM, which obtains a weighted average F1 score of 95.19% on test data.
- Score: 0.7614628596146599
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this system paper we present our contribution to the Constraint 2021
COVID-19 Fake News Detection Shared Task, which poses the challenge of
classifying COVID-19 related social media posts as either fake or real. In our
system, we address this challenge by applying classical machine learning
algorithms together with several linguistic features, such as n-grams,
readability, emotional tone and punctuation. In terms of pre-processing, we
experiment with various steps like stop word removal, stemming/lemmatization,
link removal and more. We find our best performing system to be based on a
linear SVM, which obtains a weighted average F1 score of 95.19% on test data
and ranks in the middle of the leaderboard (place 80 of 167).
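As a rough illustration of two ideas named in the abstract, the sketch below shows link and stop word removal as pre-processing steps, punctuation counts as a simple surface feature, and the weighted average F1 metric (per-class F1 weighted by class support). It is not the author's implementation: the function names, the tiny stop word list, and the URL pattern are all assumptions made for this example, and only the Python standard library is used.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list (the paper's list is not given here).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and"}
# Assumed URL pattern: anything starting with http:// or https:// up to whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def preprocess(text):
    """Lowercase, drop links, tokenize, and remove stop words."""
    without_links = URL_PATTERN.sub(" ", text.lower())
    tokens = re.findall(r"[a-z0-9']+", without_links)
    return [tok for tok in tokens if tok not in STOP_WORDS]


def punctuation_counts(text):
    """Count a few punctuation marks as simple surface features."""
    return {mark: text.count(mark) for mark in "!?."}


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)
    total = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += f1 * n_true
    return total / len(y_true)


if __name__ == "__main__":
    print(preprocess("The vaccine is FAKE! See https://example.com/x now"))
    print(punctuation_counts("Really?! No..."))
    print(weighted_f1(["real", "real", "fake", "fake"],
                      ["real", "fake", "fake", "fake"]))
```

In a full system, token lists like these would typically feed an n-gram vectorizer and a linear classifier; that part is omitted here to keep the sketch dependency-free.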
Related papers
- Mavericks at ArAIEval Shared Task: Towards a Safer Digital Space --
Transformer Ensemble Models Tackling Deception and Persuasion [0.0]
We present our approaches for task 1-A and task 2-A of the shared task which focus on persuasion technique detection and disinformation detection respectively.
The tasks use multigenre snippets of tweets and news articles for the given binary classification problem.
We achieved a micro F1-score of 0.742 on task 1-A (8th rank on the leaderboard) and 0.901 on task 2-A (7th rank on the leaderboard) respectively.
arXiv Detail & Related papers (2023-11-30T17:26:57Z)
- Unify word-level and span-level tasks: NJUNLP's Participation for the
WMT2023 Quality Estimation Shared Task [59.46906545506715]
We introduce the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task.
Our team submitted predictions for the English-German language pair on both sub-tasks.
Our models achieved the best results in English-German for both word-level and fine-grained error span detection sub-tasks.
arXiv Detail & Related papers (2023-09-23T01:52:14Z)
- BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Based on the Transformer, we apply several effective variants.
Our systems achieve 0.810 and 0.946 COMET scores.
arXiv Detail & Related papers (2022-11-28T02:35:04Z)
- BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning [108.41464483878683]
We study the problem of enabling a vision-based robotic manipulation system to generalize to novel tasks.
We develop an interactive and flexible imitation learning system that can learn from both demonstrations and interventions.
When scaling data collection on a real robot to more than 100 distinct tasks, we find that this system can perform 24 unseen manipulation tasks with an average success rate of 44%.
arXiv Detail & Related papers (2022-02-04T07:30:48Z)
- Sequence-level self-learning with multiple hypotheses [53.04725240411895]
We develop new self-learning techniques with an attention-based sequence-to-sequence (seq2seq) model for automatic speech recognition (ASR).
In contrast to conventional unsupervised learning approaches, we adopt the multi-task learning (MTL) framework.
Our experiment results show that our method can reduce the WER on the British speech data from 14.55% to 10.36% compared to the baseline model trained with the US English data only.
arXiv Detail & Related papers (2021-12-10T20:47:58Z)
- Checkovid: A COVID-19 misinformation detection system on Twitter using
network and content mining perspectives [9.69596041242667]
During the COVID-19 pandemic, social media platforms were ideal for communicating due to social isolation and quarantine.
To tackle this problem, we present two COVID-19 related misinformation datasets on Twitter.
We propose a misinformation detection system comprising network-based and content-based processes based on machine learning algorithms and NLP techniques.
arXiv Detail & Related papers (2021-07-20T20:58:23Z)
- Two-Stream Consensus Network: Submission to HACS Challenge 2021
Weakly-Supervised Learning Track [78.64815984927425]
The goal of weakly-supervised temporal action localization is to temporally locate and classify action of interest in untrimmed videos.
We adopt the two-stream consensus network (TSCN) as the main framework in this challenge.
Our solution ranked 2nd in this challenge, and we hope our method can serve as a baseline for future academic research.
arXiv Detail & Related papers (2021-06-21T03:36:36Z)
- A Heuristic-driven Ensemble Framework for COVID-19 Fake News Detection [5.979726271522835]
We describe our Fake News Detection system that automatically identifies whether a tweet related to COVID-19 is "real" or "fake".
We have used an ensemble model consisting of pre-trained models that has helped us achieve a joint 8th position on the leader board.
We drastically improved our system by incorporating a novel algorithm based on username handles and link domains in tweets, achieving an F1-score of 0.9883.
arXiv Detail & Related papers (2021-01-10T13:21:08Z)
- Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News
Detection in English [30.61407811064534]
We describe our system for the AAAI 2021 shared task of COVID-19 Fake News Detection in English.
We propose an ensemble of different pre-trained language models, including BERT, RoBERTa, and ERNIE, among others.
We also conduct an extensive analysis of the samples that are not correctly classified.
arXiv Detail & Related papers (2021-01-07T04:01:13Z)
- Phonemer at WNUT-2020 Task 2: Sequence Classification Using COVID
Twitter BERT and Bagging Ensemble Technique based on Plurality Voting [0.0]
We develop a system that automatically identifies whether an English Tweet related to the novel coronavirus (COVID-19) is informative or not.
Our final approach achieved an F1-score of 0.9037 and we were ranked sixth overall with F1-score as the evaluation criteria.
arXiv Detail & Related papers (2020-10-01T10:54:54Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for
Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves 91.51% F1 score in English Sub-task A, which is comparable to the first place.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.