Legend at ArAIEval Shared Task: Persuasion Technique Detection using a
Language-Agnostic Text Representation Model
- URL: http://arxiv.org/abs/2310.09661v1
- Date: Sat, 14 Oct 2023 20:27:04 GMT
- Title: Legend at ArAIEval Shared Task: Persuasion Technique Detection using a
Language-Agnostic Text Representation Model
- Authors: Olumide E. Ojo, Olaronke O. Adebanji, Hiram Calvo, Damian O. Dieke,
Olumuyiwa E. Ojo, Seye E. Akinsanya, Tolulope O. Abiola, Anna Feldman
- Abstract summary: In this paper, we share our best performing submission to the Arabic AI Tasks Evaluation Challenge (ArAIEval) at ArabicNLP 2023.
Our focus was on Task 1, which involves identifying persuasion techniques in excerpts from tweets and news articles.
Persuasion techniques in Arabic texts were detected using a training loop with XLM-RoBERTa, a language-agnostic text representation model.
- Score: 1.3506669466260708
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we share our best performing submission to the Arabic AI Tasks
Evaluation Challenge (ArAIEval) at ArabicNLP 2023. Our focus was on Task 1,
which involves identifying persuasion techniques in excerpts from tweets and
news articles. Persuasion techniques in Arabic texts were detected using a
training loop with XLM-RoBERTa, a language-agnostic text representation model.
This approach proved effective, leveraging fine-tuning of a multilingual
language model. In our evaluation on the test set, we achieved a micro F1 score
of 0.64 for subtask A of the competition.
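The abstract describes detecting persuasion techniques by fine-tuning XLM-RoBERTa in a training loop. Below is a minimal sketch of such a loop with the Hugging Face transformers library, treating subtask 1A as binary classification (persuasive vs. not); the checkpoint, hyperparameters, and placeholder data are illustrative assumptions, not the authors' exact configuration.
```python
# Minimal sketch: fine-tuning XLM-RoBERTa for ArAIEval subtask 1A as binary
# classification. Placeholder data and hyperparameters are assumptions.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2).to(device)

# Toy placeholders standing in for the real ArAIEval training excerpts.
texts = ["...", "..."]           # tweet / news-paragraph excerpts
labels = [1, 0]                   # 1 = contains a persuasion technique

enc = tokenizer(texts, truncation=True, padding=True, max_length=256,
                return_tensors="pt")
dataset = torch.utils.data.TensorDataset(
    enc["input_ids"], enc["attention_mask"], torch.tensor(labels))
loader = DataLoader(dataset, batch_size=16, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):                       # assumed number of epochs
    for input_ids, attention_mask, y in loader:
        optimizer.zero_grad()
        out = model(input_ids=input_ids.to(device),
                    attention_mask=attention_mask.to(device),
                    labels=y.to(device))
        out.loss.backward()                  # cross-entropy from the classification head
        optimizer.step()
```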
Related papers
- Nullpointer at ArAIEval Shared Task: Arabic Propagandist Technique Detection with Token-to-Word Mapping in Sequence Tagging [0.0]
This paper investigates the optimization of propaganda technique detection in Arabic text, including tweets and news paragraphs, from ArAIEval Shared Task 1.
Experimental results show that relying on the first token of each word for technique prediction produces the best performance.
arXiv Detail & Related papers (2024-07-01T15:15:24Z)
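The Nullpointer entry above reports that predicting the technique label from the first subword token of each word works best. One common way to realize this token-to-word mapping with a fast Hugging Face tokenizer is sketched below; the tokenizer checkpoint, example sentence, and label ids are illustrative assumptions.
```python
# Sketch: assign each word's tag to its first subword token only; remaining
# subwords get -100 so the token-classification loss ignores them.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")  # assumed checkpoint

words = ["مثال", "على", "نص", "عربي"]          # one pre-tokenized sentence
word_labels = [0, 2, 0, 0]                      # hypothetical technique tag ids

enc = tokenizer(words, is_split_into_words=True, truncation=True)
aligned = []
prev_word = None
for word_id in enc.word_ids():                  # maps each subword to its word index
    if word_id is None:                         # special tokens (<s>, </s>)
        aligned.append(-100)
    elif word_id != prev_word:                  # first subword of a new word
        aligned.append(word_labels[word_id])
    else:                                       # continuation subword
        aligned.append(-100)
    prev_word = word_id

print(list(zip(enc.tokens(), aligned)))
```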
- Mavericks at ArAIEval Shared Task: Towards a Safer Digital Space -- Transformer Ensemble Models Tackling Deception and Persuasion [0.0]
We present our approaches for task 1-A and task 2-A of the shared task, which focus on persuasion technique detection and disinformation detection, respectively.
The tasks use multigenre snippets of tweets and news articles for the given binary classification problem.
We achieved micro F1-scores of 0.742 on task 1-A (8th on the leaderboard) and 0.901 on task 2-A (7th on the leaderboard).
arXiv Detail & Related papers (2023-11-30T17:26:57Z)
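The Mavericks entry above describes an ensemble of transformer models for the binary classification tasks. A simple way to combine such models is to average their class probabilities (soft voting), as in the sketch below; the member checkpoints are hypothetical placeholders, since the exact ensemble members and combination rule are not stated in this summary.
```python
# Sketch: soft-voting ensemble over several already fine-tuned sequence classifiers.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MEMBER_DIRS = ["./model_a", "./model_b", "./model_c"]   # hypothetical fine-tuned members
members = [(AutoTokenizer.from_pretrained(d),
            AutoModelForSequenceClassification.from_pretrained(d).eval())
           for d in MEMBER_DIRS]

def ensemble_predict(text: str) -> int:
    """Average the class probabilities of all ensemble members and take the argmax."""
    probs = []
    for tokenizer, model in members:
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            probs.append(torch.softmax(model(**inputs).logits, dim=-1))
    return int(torch.stack(probs).mean(dim=0).argmax(dim=-1))
```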
- Strategies for improving low resource speech to text translation relying on pre-trained ASR models [59.90106959717875]
This paper presents techniques and findings for improving the performance of low-resource speech-to-text translation (ST).
We conducted experiments on both simulated and real low-resource setups, on the language pairs English-Portuguese and Tamasheq-French, respectively.
arXiv Detail & Related papers (2023-05-31T21:58:07Z)
- ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation [79.66359274050885]
We present ComSL, a speech-language model built atop a composite architecture of publicly available pretrained speech-only and language-only models.
Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks.
arXiv Detail & Related papers (2023-05-24T07:42:15Z)
- Code-Switching Text Generation and Injection in Mandarin-English ASR [57.57570417273262]
We investigate text generation and injection for improving the performance of a streaming model commonly used in industry, the Transformer-Transducer (T-T).
We first propose a strategy to generate code-switching text data and then investigate injecting the generated text into the T-T model explicitly, via Text-To-Speech (TTS) conversion, or implicitly, by tying the speech and text latent spaces.
Experimental results on a T-T model trained with a dataset containing 1,800 hours of real Mandarin-English code-switched speech show that our approaches to injecting generated code-switching text significantly boost the performance of T-T models.
arXiv Detail & Related papers (2023-03-20T09:13:27Z)
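The entry above proposes generating code-switching text and injecting it into a Transformer-Transducer. The paper's actual generation strategy is not described in this summary; the sketch below shows only a simple, commonly used baseline, replacing random Mandarin words with dictionary translations, purely to illustrate what generated code-switching text can look like.
```python
# Illustrative baseline only, NOT the paper's method: create Mandarin-English
# code-switched sentences by randomly swapping words for dictionary translations.
import random

# Tiny hypothetical Mandarin -> English lexicon.
LEXICON = {"会议": "meeting", "明天": "tomorrow", "取消": "cancel"}

def code_switch(tokens, swap_prob=0.3):
    """Replace each translatable token with its English gloss with probability swap_prob."""
    out = []
    for tok in tokens:
        if tok in LEXICON and random.random() < swap_prob:
            out.append(LEXICON[tok])
        else:
            out.append(tok)
    return out

random.seed(0)
print(code_switch(["明天", "的", "会议", "取消", "了"]))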
- Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE [93.98660272309974]
This report briefly describes our submission, Vega v1, to the General Language Understanding Evaluation (GLUE) leaderboard.
GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.
With our optimized pretraining and fine-tuning strategies, our 1.3-billion-parameter model sets a new state of the art on 4 of the 9 tasks, achieving the best average score of 91.3.
arXiv Detail & Related papers (2023-02-18T09:26:35Z)
- RuArg-2022: Argument Mining Evaluation [69.87149207721035]
This paper is the organizers' report on the first competition of argumentation analysis systems dealing with Russian-language texts.
A corpus containing 9,550 sentences (comments on social media posts) on three topics related to the COVID-19 pandemic was prepared.
The system that won the first place in both tasks used the NLI (Natural Language Inference) variant of the BERT architecture.
arXiv Detail & Related papers (2022-06-18T17:13:37Z)
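The RuArg-2022 entry above notes that the winning system used the NLI variant of the BERT architecture. One plausible reading of that setup is to encode the topic statement and the comment as a sentence pair, NLI-style, and classify the pair, as sketched below; the checkpoint name, label set, and example texts are illustrative assumptions, not the winning system's configuration.
```python
# Sketch of NLI-style pair classification for stance detection: the topic statement
# plays the role of the premise and the comment that of the hypothesis.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINT = "bert-base-multilingual-cased"       # assumed, not the winning checkpoint
STANCES = ["against", "other", "in favor"]         # illustrative label set

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(
    CHECKPOINT, num_labels=len(STANCES)).eval()

topic = "Vaccination is necessary to stop the pandemic."   # example topic (translated)
comment = "I will get vaccinated as soon as I can."

# Sentence-pair encoding, exactly as in NLI-style fine-tuning.
inputs = tokenizer(topic, comment, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(STANCES[int(logits.argmax(dim=-1))])         # head is untrained here, so output is arbitrary
```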
- UPB at SemEval-2020 Task 9: Identifying Sentiment in Code-Mixed Social Media Texts using Transformers and Multi-Task Learning [1.7196613099537055]
We describe the systems developed by our team for SemEval-2020 Task 9.
We aim to cover two well-known code-mixed languages: Hindi-English and Spanish-English.
Our approach achieves promising performance on the Hindi-English task, with an average F1-score of 0.6850.
For the Spanish-English task, we obtained an average F1-score of 0.7064, ranking our team 17th out of 29 participants.
arXiv Detail & Related papers (2020-09-06T17:19:18Z)
- Multi-Dialect Arabic BERT for Country-Level Dialect Identification [1.2928709656541642]
We present the experiments conducted and the models developed by our competing team, Mawdoo3 AI.
The dialect identification subtask provides 21,000 country-level labeled tweets covering all 21 Arab countries.
We publicly release the pre-trained language model component of our winning solution under the name Multi-dialect-Arabic-BERT.
arXiv Detail & Related papers (2020-07-10T21:11:46Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves a 91.51% F1-score on English Sub-task A, which is comparable to the first-place result.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)
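The Kungfupanda entry above combines multi-task learning with BERT-based models for offensive language detection. A common way to set this up is a shared encoder with one classification head per sub-task and a summed loss, as in the sketch below; the checkpoint, head sizes, equal loss weights, and toy batch are assumptions, not the paper's exact design.
```python
# Sketch: shared BERT encoder with one head per OffensEval sub-task; losses are summed.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class MultiTaskOffense(nn.Module):
    def __init__(self, checkpoint: str = "bert-base-uncased"):   # assumed checkpoint
        super().__init__()
        self.encoder = AutoModel.from_pretrained(checkpoint)
        hidden = self.encoder.config.hidden_size
        # Sub-task A: offensive/not; B: targeted/untargeted; C: target type (3-way).
        self.heads = nn.ModuleDict({
            "a": nn.Linear(hidden, 2),
            "b": nn.Linear(hidden, 2),
            "c": nn.Linear(hidden, 3),
        })
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, input_ids, attention_mask, labels=None):
        cls = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state[:, 0]
        logits = {task: head(cls) for task, head in self.heads.items()}
        if labels is None:
            return logits
        # Equal-weighted sum of per-task losses (the weighting is an assumption).
        loss = sum(self.loss_fn(logits[t], labels[t]) for t in self.heads)
        return loss, logits

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["you are awful", "have a nice day"], padding=True,
                  return_tensors="pt")
labels = {"a": torch.tensor([1, 0]), "b": torch.tensor([1, 0]),
          "c": torch.tensor([1, 0])}
model = MultiTaskOffense()
loss, _ = model(batch["input_ids"], batch["attention_mask"], labels)
```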
This list is automatically generated from the titles and abstracts of the papers on this site.