Related papers: BUSTED at AraGenEval Shared Task: A Comparative Study of Transformer-Based Models for Arabic AI-Generated Text Detection

BUSTED at AraGenEval Shared Task: A Comparative Study of Transformer-Based Models for Arabic AI-Generated Text Detection

URL: http://arxiv.org/abs/2510.20610v2
Date: Sat, 25 Oct 2025 15:33:29 GMT
Title: BUSTED at AraGenEval Shared Task: A Comparative Study of Transformer-Based Models for Arabic AI-Generated Text Detection
Authors: Ali Zain, Sareem Farooqui, Muhammad Rafi,
Abstract summary: We investigated the effectiveness of three pre-trained transformer models: AraELECTRA, CAMeLBERT, and XLM-RoBERTa.<n>Our approach involved fine-tuning each model on the provided dataset for a binary classification task.<n>The multilingual XLM-RoBERTa model achieved the highest performance with an F1 score of 0.7701, outperforming the specialized Arabic models.
Score: 0.5735035463793009
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper details our submission to the AraGenEval Shared Task on Arabic AI-generated text detection, where our team, BUSTED, secured 5th place. We investigated the effectiveness of three pre-trained transformer models: AraELECTRA, CAMeLBERT, and XLM-RoBERTa. Our approach involved fine-tuning each model on the provided dataset for a binary classification task. Our findings revealed a surprising result: the multilingual XLM-RoBERTa model achieved the highest performance with an F1 score of 0.7701, outperforming the specialized Arabic models. This work underscores the complexities of AI-generated text detection and highlights the strong generalization capabilities of multilingual models.

Related papers

Assessing Classical Machine Learning and Transformer-based Approaches for Detecting AI-Generated Research Text [0.0]
Machine learning approaches can distinguish ChatGPT-3.5-generated texts from human-written texts.<n>DistilBERT achieves the overall best performance, while Logistic Regression and BERT-Custom offer solid, balanced alternatives.
arXiv Detail & Related papers (2025-09-20T04:36:21Z)
Sarang at DEFACTIFY 4.0: Detecting AI-Generated Text Using Noised Data and an Ensemble of DeBERTa Models [0.0]
This paper presents an effective approach to detect AI-generated text.<n>It was developed for the Defactify 4.0 shared task at the fourth workshop on multimodal fact checking and hate speech detection.<n>Our team (Sarang) achieved the 1st place in both tasks with F1 scores of 1.0 and 0.9531, respectively.
arXiv Detail & Related papers (2025-02-24T05:32:00Z)
LuxVeri at GenAI Detection Task 1: Inverse Perplexity Weighted Ensemble for Robust Detection of AI-Generated Text across English and Multilingual Contexts [0.8495482945981923]
This paper presents a system developed for Task 1 of the COLING 2025 Workshop on Detecting AI-Generated Content.<n>Our approach utilizes an ensemble of models, with weights assigned according to each model's inverse perplexity, to enhance classification accuracy.<n>Our results demonstrate the effectiveness of inverse perplexity weighting in improving the robustness of machine-generated text detection across both monolingual and multilingual settings.
arXiv Detail & Related papers (2025-01-21T06:32:32Z)
CATT: Character-based Arabic Tashkeel Transformer [0.0]
Tashkeel, or Arabic Text Diacritization, greatly enhances the comprehension of Arabic text. This paper introduces a new approach to training ATD models. We evaluate our models alongside 11 commercial and open-source models.
arXiv Detail & Related papers (2024-07-03T16:05:20Z)
Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing machine translation (NMT) studies mainly focus on developing dataset-specific models. We propose a versatile'' model, i.e., the Unified Model Learning for NMT (UMLNMT) that works with data from different tasks. OurNMT results in substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets. Despite promising results, current models still suffer from generating factually inconsistent summaries. We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z)
AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive Summarization [23.540743628126837]
We propose AraBART, the first Arabic model in which the encoder and the decoder are pretrained end-to-end, based on BART. We show that AraBART achieves the best performance on multiple abstractive summarization datasets.
arXiv Detail & Related papers (2022-03-21T13:11:41Z)
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation [101.00109827301235]
We introduce a novel paradigm for dataset creation based on human and machine collaboration. We use dataset cartography to automatically identify examples that demonstrate challenging reasoning patterns, and instruct GPT-3 to compose new examples with similar patterns. The resulting dataset, WANLI, consists of 108,357 natural language inference (NLI) examples that present unique empirical strengths.
arXiv Detail & Related papers (2022-01-16T03:13:49Z)
Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training [86.91380874390778]
We present Generation-Augmented Pre-training (GAP), that jointly learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-train data. Based on experimental results, neural semantics that leverage GAP MODEL obtain new state-of-the-art results on both SPIDER and CRITERIA-TO-generative benchmarks.
arXiv Detail & Related papers (2020-12-18T15:53:50Z)
Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language. We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models. Our model achieves an improvement of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 scores over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z)
Exemplar-Controllable Paraphrasing and Translation using Bitext [57.92051459102902]
We adapt models from prior work to be able to learn solely from bilingual text (bitext) Our single proposed model can perform four tasks: controlled paraphrase generation in both languages and controlled machine translation in both language directions.
arXiv Detail & Related papers (2020-10-12T17:02:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.