BUSTED at AraGenEval Shared Task: A Comparative Study of Transformer-Based Models for Arabic AI-Generated Text Detection
- URL: http://arxiv.org/abs/2510.20610v2
- Date: Sat, 25 Oct 2025 15:33:29 GMT
- Title: BUSTED at AraGenEval Shared Task: A Comparative Study of Transformer-Based Models for Arabic AI-Generated Text Detection
- Authors: Ali Zain, Sareem Farooqui, Muhammad Rafi,
- Abstract summary: We investigated the effectiveness of three pre-trained transformer models: AraELECTRA, CAMeLBERT, and XLM-RoBERTa.<n>Our approach involved fine-tuning each model on the provided dataset for a binary classification task.<n>The multilingual XLM-RoBERTa model achieved the highest performance with an F1 score of 0.7701, outperforming the specialized Arabic models.
- Score: 0.5735035463793009
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper details our submission to the AraGenEval Shared Task on Arabic AI-generated text detection, where our team, BUSTED, secured 5th place. We investigated the effectiveness of three pre-trained transformer models: AraELECTRA, CAMeLBERT, and XLM-RoBERTa. Our approach involved fine-tuning each model on the provided dataset for a binary classification task. Our findings revealed a surprising result: the multilingual XLM-RoBERTa model achieved the highest performance with an F1 score of 0.7701, outperforming the specialized Arabic models. This work underscores the complexities of AI-generated text detection and highlights the strong generalization capabilities of multilingual models.
Related papers
- Assessing Classical Machine Learning and Transformer-based Approaches for Detecting AI-Generated Research Text [0.0]
Machine learning approaches can distinguish ChatGPT-3.5-generated texts from human-written texts.<n>DistilBERT achieves the overall best performance, while Logistic Regression and BERT-Custom offer solid, balanced alternatives.
arXiv Detail & Related papers (2025-09-20T04:36:21Z) - Sarang at DEFACTIFY 4.0: Detecting AI-Generated Text Using Noised Data and an Ensemble of DeBERTa Models [0.0]
This paper presents an effective approach to detect AI-generated text.<n>It was developed for the Defactify 4.0 shared task at the fourth workshop on multimodal fact checking and hate speech detection.<n>Our team (Sarang) achieved the 1st place in both tasks with F1 scores of 1.0 and 0.9531, respectively.
arXiv Detail & Related papers (2025-02-24T05:32:00Z) - LuxVeri at GenAI Detection Task 1: Inverse Perplexity Weighted Ensemble for Robust Detection of AI-Generated Text across English and Multilingual Contexts [0.8495482945981923]
This paper presents a system developed for Task 1 of the COLING 2025 Workshop on Detecting AI-Generated Content.<n>Our approach utilizes an ensemble of models, with weights assigned according to each model's inverse perplexity, to enhance classification accuracy.<n>Our results demonstrate the effectiveness of inverse perplexity weighting in improving the robustness of machine-generated text detection across both monolingual and multilingual settings.
arXiv Detail & Related papers (2025-01-21T06:32:32Z) - CATT: Character-based Arabic Tashkeel Transformer [0.0]
Tashkeel, or Arabic Text Diacritization, greatly enhances the comprehension of Arabic text.
This paper introduces a new approach to training ATD models.
We evaluate our models alongside 11 commercial and open-source models.
arXiv Detail & Related papers (2024-07-03T16:05:20Z) - Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a versatile'' model, i.e., the Unified Model Learning for NMT (UMLNMT) that works with data from different tasks.
OurNMT results in substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z) - mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z) - AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive
Summarization [23.540743628126837]
We propose AraBART, the first Arabic model in which the encoder and the decoder are pretrained end-to-end, based on BART.
We show that AraBART achieves the best performance on multiple abstractive summarization datasets.
arXiv Detail & Related papers (2022-03-21T13:11:41Z) - WANLI: Worker and AI Collaboration for Natural Language Inference
Dataset Creation [101.00109827301235]
We introduce a novel paradigm for dataset creation based on human and machine collaboration.
We use dataset cartography to automatically identify examples that demonstrate challenging reasoning patterns, and instruct GPT-3 to compose new examples with similar patterns.
The resulting dataset, WANLI, consists of 108,357 natural language inference (NLI) examples that present unique empirical strengths.
arXiv Detail & Related papers (2022-01-16T03:13:49Z) - Learning Contextual Representations for Semantic Parsing with
Generation-Augmented Pre-Training [86.91380874390778]
We present Generation-Augmented Pre-training (GAP), that jointly learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-train data.
Based on experimental results, neural semantics that leverage GAP MODEL obtain new state-of-the-art results on both SPIDER and CRITERIA-TO-generative benchmarks.
arXiv Detail & Related papers (2020-12-18T15:53:50Z) - Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves an improvement of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 scores over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z) - Exemplar-Controllable Paraphrasing and Translation using Bitext [57.92051459102902]
We adapt models from prior work to be able to learn solely from bilingual text (bitext)
Our single proposed model can perform four tasks: controlled paraphrase generation in both languages and controlled machine translation in both language directions.
arXiv Detail & Related papers (2020-10-12T17:02:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.