Mavericks at ArAIEval Shared Task: Towards a Safer Digital Space --
Transformer Ensemble Models Tackling Deception and Persuasion
- URL: http://arxiv.org/abs/2311.18730v1
- Date: Thu, 30 Nov 2023 17:26:57 GMT
- Title: Mavericks at ArAIEval Shared Task: Towards a Safer Digital Space --
Transformer Ensemble Models Tackling Deception and Persuasion
- Authors: Sudeep Mangalvedhekar, Kshitij Deshpande, Yash Patwardhan, Vedant
Deshpande and Ravindra Murumkar
- Abstract summary: We present our approaches for task 1-A and task 2-A of the shared task which focus on persuasion technique detection and disinformation detection respectively.
The tasks use multigenre snippets of tweets and news articles for the given binary classification problem.
We achieved a micro F1-score of 0.742 on task 1-A (8th rank on the leaderboard) and 0.901 on task 2-A (7th rank on the leaderboard).
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we highlight our approach for the "Arabic AI Tasks Evaluation
(ArAiEval) Shared Task 2023". We present our approaches for task 1-A and task
2-A of the shared task which focus on persuasion technique detection and
disinformation detection respectively. Detection of persuasion techniques and
disinformation has become imperative to avoid distortion of authentic
information. The tasks use multigenre snippets of tweets and news articles for
the given binary classification problem. We experiment with several
transformer-based models that were pre-trained on the Arabic language. We
fine-tune these state-of-the-art models on the provided dataset. Ensembling is
employed to enhance the performance of the systems. We achieved a micro
F1-score of 0.742 on task 1-A (8th rank on the leaderboard) and 0.901 on task
2-A (7th rank on the leaderboard).
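The fine-tune-then-ensemble pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' code: the per-model probabilities are mock numbers standing in for the outputs of several fine-tuned Arabic transformers, combined by soft voting (probability averaging) and scored with micro F1, which for single-label binary classification coincides with accuracy.

```python
def soft_vote(prob_lists):
    """Average each example's positive-class probability across models."""
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]

def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over both classes; for single-label binary
    classification this reduces to accuracy."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    precision = recall = correct / len(y_true)
    return 2 * precision * recall / (precision + recall)

# Mock positive-class probabilities from three fine-tuned models
# for five text snippets (hypothetical values for illustration).
model_probs = [
    [0.9, 0.2, 0.6, 0.4, 0.8],
    [0.7, 0.3, 0.4, 0.6, 0.9],
    [0.8, 0.1, 0.7, 0.3, 0.7],
]
avg = soft_vote(model_probs)
preds = [1 if p >= 0.5 else 0 for p in avg]
gold = [1, 0, 1, 0, 1]
print(preds)                  # [1, 0, 1, 0, 1]
print(micro_f1(gold, preds))  # 1.0
```

Hard (majority) voting over the models' discrete labels is a common alternative to the probability averaging shown here; the paper does not specify which variant was used.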
Related papers
- Advacheck at GenAI Detection Task 1: AI Detection Powered by Domain-Aware Multi-Tasking [0.0]
The paper describes a system designed by the Advacheck team to recognise machine-generated and human-written texts in the monolingual subtask of the GenAI Detection Task 1 competition.
Our developed system is a multi-task architecture with shared Transformer between several classification heads.
arXiv Detail & Related papers (2024-11-18T17:03:30Z)
- ArAIEval Shared Task: Persuasion Techniques and Disinformation Detection in Arabic Text [41.3267575540348]
We present an overview of the ArAIEval shared task, organized as part of the first ArabicNLP 2023 conference, co-located with EMNLP 2023.
ArAIEval offers two tasks over Arabic text: (i) persuasion technique detection, focusing on identifying persuasion techniques in tweets and news articles, and (ii) disinformation detection in binary and multiclass setups over tweets.
A total of 20 teams participated in the final evaluation phase, with 14 and 16 teams participating in Tasks 1 and 2, respectively.
arXiv Detail & Related papers (2023-11-06T15:21:19Z)
- Attention at SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS) [15.52876591707497]
The work focuses on interpretability, trust, and understanding of the decisions made by models on classification tasks.
The first task consists of determining Binary Sexism Detection.
The second task describes the Category of Sexism.
The third task describes a more Fine-grained Category of Sexism.
arXiv Detail & Related papers (2023-04-10T14:24:52Z)
- Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE [93.98660272309974]
This report briefly describes our submission Vega v1 on the General Language Understanding Evaluation leaderboard.
GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.
With our optimized pretraining and fine-tuning strategies, our 1.3-billion-parameter model sets a new state of the art on 4 of 9 tasks, achieving the best average score of 91.3.
arXiv Detail & Related papers (2023-02-18T09:26:35Z)
- Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE [203.65227947509933]
This report describes our JDExplore d-team's Vega v2 submission on the SuperGLUE leaderboard.
SuperGLUE is more challenging than the widely used general language understanding evaluation (GLUE) benchmark, containing eight difficult language understanding tasks.
arXiv Detail & Related papers (2022-12-04T15:36:18Z)
- X-PuDu at SemEval-2022 Task 7: A Replaced Token Detection Task Pre-trained Model with Pattern-aware Ensembling for Identifying Plausible Clarifications [13.945286351253717]
This paper describes our winning system on SemEval 2022 Task 7: Identifying Plausible Clarifications of Implicit and Underspecified Phrases in instructional texts.
A pre-trained replaced-token-detection model is used with slightly different task-specific heads for SubTask-A (multi-class classification) and SubTask-B (ranking).
Our system achieves a 68.90% accuracy score and a 0.8070 Spearman's rank correlation score, surpassing the 2nd place by large margins of 2.7 and 2.2 percentage points for SubTask-A and SubTask-B, respectively.
arXiv Detail & Related papers (2022-11-27T05:46:46Z)
- Overview of Abusive and Threatening Language Detection in Urdu at FIRE 2021 [50.591267188664666]
We present two shared tasks of abusive and threatening language detection for the Urdu language.
We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening.
For both subtasks, the mBERT-based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z)
- Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z)
- UPB at SemEval-2021 Task 7: Adversarial Multi-Task Learning for Detecting and Rating Humor and Offense [0.6404122934568858]
We describe our adversarial multi-task network, AMTL-Humor, used to detect and rate humor and offensive texts.
Our best model consists of an ensemble of all tested configurations, and achieves a 95.66% F1-score and 94.70% accuracy for Task 1a.
arXiv Detail & Related papers (2021-04-13T09:59:05Z)
- Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation [63.98724740606457]
We present a joint effort of four groups, namely GT, USTC, Tencent, and UKE, to tackle Task 1 - Acoustic Scene Classification (ASC) in the DCASE 2020 Challenge.
Task 1a focuses on ASC of audio signals recorded with multiple (real and simulated) devices into ten different fine-grained classes.
Task 1b concerns the classification of data into three higher-level classes using low-complexity solutions.
arXiv Detail & Related papers (2020-07-16T15:07:14Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves 91.51% F1 score in English Sub-task A, which is comparable to the first place.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.