IntegrityAI at GenAI Detection Task 2: Detecting Machine-Generated Academic Essays in English and Arabic Using ELECTRA and Stylometry
- URL: http://arxiv.org/abs/2501.05476v1
- Date: Tue, 07 Jan 2025 10:19:56 GMT
- Title: IntegrityAI at GenAI Detection Task 2: Detecting Machine-Generated Academic Essays in English and Arabic Using ELECTRA and Stylometry
- Authors: Mohammad AL-Smadi
- Abstract summary: This research utilizes pre-trained, transformer-based models fine-tuned on Arabic and English academic essays with stylometric features.
Proposed models achieved excellent results with an F1-score of 99.7%, ranking 2nd among 26 teams in the English subtask, and 98.4%, finishing 1st out of 23 teams in the Arabic one.
- Score: 0.21756081703275998
- Abstract: Recent research has investigated the problem of detecting machine-generated essays for academic purposes. To address this challenge, this research utilizes pre-trained, transformer-based models fine-tuned on Arabic and English academic essays with stylometric features. Custom models based on ELECTRA for English and AraELECTRA for Arabic were trained and evaluated using a benchmark dataset. Proposed models achieved excellent results with an F1-score of 99.7%, ranking 2nd among 26 teams in the English subtask, and 98.4%, finishing 1st out of 23 teams in the Arabic one.
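The abstract gives no implementation details, but a minimal sketch of the setup it describes, an ELECTRA-style encoder whose representation is fused with hand-crafted stylometric features before a binary human-vs-machine head, might look as follows. The checkpoint name, feature set, and fusion head below are illustrative assumptions, not the authors' code.
```python
# Minimal sketch, not the authors' code: an ELECTRA encoder whose [CLS]
# representation is concatenated with hand-crafted stylometric features
# before a binary human-vs-machine classification head.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

ENCODER = "google/electra-base-discriminator"  # assumed English checkpoint


def stylometric_features(text: str) -> torch.Tensor:
    """Toy stylometric features (illustrative, not the paper's feature set)."""
    words = text.split()
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    n_words = max(len(words), 1)
    feats = [
        sum(len(w) for w in words) / n_words,                   # mean word length
        len(set(w.lower() for w in words)) / n_words,           # type-token ratio
        sum(c in ",;:.!?" for c in text) / max(len(text), 1),   # punctuation rate
        n_words / max(len(sentences), 1),                       # mean sentence length
    ]
    return torch.tensor(feats, dtype=torch.float)


class ElectraStylometryClassifier(nn.Module):
    def __init__(self, encoder_name: str = ENCODER, n_style_feats: int = 4):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.head = nn.Sequential(
            nn.Linear(hidden + n_style_feats, 256), nn.ReLU(), nn.Linear(256, 2)
        )

    def forward(self, input_ids, attention_mask, style_feats):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # ELECTRA exposes no pooled output; take [CLS]
        return self.head(torch.cat([cls, style_feats], dim=-1))


tokenizer = AutoTokenizer.from_pretrained(ENCODER)
model = ElectraStylometryClassifier()
essay = "This essay examines the role of education in modern society."
enc = tokenizer(essay, return_tensors="pt", truncation=True, max_length=512)
logits = model(enc["input_ids"], enc["attention_mask"], stylometric_features(essay).unsqueeze(0))
```
Fine-tuning this end to end with cross-entropy loss on labelled essays, with aubmindlab/araelectra-base-discriminator substituted for the Arabic subtask, would follow the general recipe the abstract describes.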
Related papers
- GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human [71.42669028683741]
We present a shared task on binary machine-generated text detection conducted as a part of the GenAI workshop at COLING 2025.
The task consists of two subtasks: Monolingual (English) and Multilingual.
We provide a comprehensive overview of the data, a summary of the results, detailed descriptions of the participating systems, and an in-depth analysis of submissions.
arXiv Detail & Related papers (2025-01-19T11:11:55Z)
- GenAI Content Detection Task 2: AI vs. Human -- Academic Essay Authenticity Challenge [12.076440946525434]
The Academic Essay Authenticity Challenge was organized as part of the GenAI Content Detection shared tasks collocated with COLING 2025.
This challenge focuses on detecting machine-generated vs. human-authored essays for academic purposes.
The challenge involves two languages: English and Arabic.
This paper outlines the task formulation, details the dataset construction process, and explains the evaluation framework.
arXiv Detail & Related papers (2024-12-24T08:33:44Z)
- HYBRINFOX at CheckThat! 2024 -- Task 1: Enhancing Language Models with Structured Information for Check-Worthiness Estimation [0.8083061106940517]
This paper summarizes the experiments and results of the HYBRINFOX team for the CheckThat! 2024 - Task 1 competition.
We propose an approach enriching Language Models such as RoBERTa with embeddings produced by triples.
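A minimal sketch of that enrichment idea: a mean-pooled RoBERTa embedding of the sentence concatenated with an embedding of an extracted (subject, predicate, object) triple, fed to a check-worthiness classifier. The triple shown and the fusion head are illustrative assumptions, not the HYBRINFOX system.
```python
# Minimal sketch (assumed design, not the HYBRINFOX system): fuse a RoBERTa
# sentence embedding with an embedding of an extracted triple, then classify.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

name = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(name)
encoder = AutoModel.from_pretrained(name)


def embed(text: str) -> torch.Tensor:
    """Mean-pooled RoBERTa embedding of a piece of text."""
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state
    return hidden.mean(dim=1).squeeze(0)


sentence = "The government spent 2 billion dollars on the new bridge."
triple = ("government", "spent", "2 billion dollars")  # assumed output of an OpenIE-style extractor

fused = torch.cat([embed(sentence), embed(" ".join(triple))])  # sentence + triple representation
classifier = nn.Linear(fused.numel(), 2)                       # check-worthy vs. not
logits = classifier(fused)
```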
arXiv Detail & Related papers (2024-07-04T11:33:54Z)
- ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic [51.922112625469836]
We present ArabicMMLU, the first multi-task language understanding benchmark for the Arabic language.
Our data comprises 40 tasks and 14,575 multiple-choice questions in Modern Standard Arabic (MSA) and is carefully constructed by collaborating with native speakers in the region.
Our evaluations of 35 models reveal substantial room for improvement, particularly among the best open-source models.
arXiv Detail & Related papers (2024-02-20T09:07:41Z)
- Legend at ArAIEval Shared Task: Persuasion Technique Detection using a Language-Agnostic Text Representation Model [1.3506669466260708]
In this paper, we share our best performing submission to the Arabic AI Tasks Evaluation Challenge (ArAIEval) at ArabicNLP 2023.
Our focus was on Task 1, which involves identifying persuasion techniques in excerpts from tweets and news articles.
The persuasion technique in Arabic texts was detected using a training loop with XLM-RoBERTa, a language-agnostic text representation model.
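A minimal sketch of such a fine-tuning loop with XLM-RoBERTa; the binary label scheme and toy in-memory examples below are assumptions, not the team's pipeline.
```python
# Minimal sketch (illustrative labels and toy examples, not the team's pipeline):
# fine-tune XLM-RoBERTa as a binary classifier for "persuasion technique present".
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

texts = ["هذا هو الحل الوحيد الممكن!", "عقد الاجتماع صباح اليوم."]  # placeholder Arabic snippets
labels = torch.tensor([1, 0])  # 1 = persuasion technique present, 0 = none
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

model.train()
for epoch in range(3):  # a real run would iterate over the ArAIEval training split in mini-batches
    out = model(**batch, labels=labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```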
arXiv Detail & Related papers (2023-10-14T20:27:04Z)
- BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Based on the Transformer, we apply several effective variants.
Our systems achieve 0.810 and 0.946 COMET scores.
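A minimal sketch of scoring system output with COMET, the metric reported above; the checkpoint and example segments are assumptions, and the submission may have used a different COMET version.
```python
# Minimal sketch of COMET scoring (requires: pip install unbabel-comet).
# Checkpoint and example data are illustrative assumptions.
from comet import download_model, load_from_checkpoint

model_path = download_model("Unbabel/wmt22-comet-da")
comet = load_from_checkpoint(model_path)

data = [{
    "src": "Guten Morgen, wie kann ich Ihnen helfen?",
    "mt":  "Good morning, how can I help you?",
    "ref": "Good morning, how may I help you?",
}]
print(comet.predict(data, batch_size=8, gpus=0))  # segment-level scores plus a system score
```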
arXiv Detail & Related papers (2022-11-28T02:35:04Z)
- The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation [53.907805815477126]
This paper presents the first relatively large-scale Amharic-English parallel sentence dataset.
We build bi-directional Amharic-English translation models by fine-tuning the existing Facebook M2M100 pre-trained model.
The results show that the normalization of Amharic homophone characters increases the performance of Amharic-English machine translation in both directions.
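A minimal sketch of this pipeline using the pre-trained facebook/m2m100_418M checkpoint rather than the paper's fine-tuned model, with a deliberately partial homophone normalization step; the character mapping shown is illustrative, not the paper's full normalization.
```python
# Minimal sketch: translate Amharic to English with pre-trained M2M100 after a
# toy, partial normalization of a few Amharic homophone characters.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

HOMOPHONE_MAP = str.maketrans({"ሐ": "ሀ", "ሠ": "ሰ", "ዐ": "አ"})  # illustrative, not exhaustive

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

amharic = "ሰላም እንዴት ነህ?".translate(HOMOPHONE_MAP)  # "Hello, how are you?"
tokenizer.src_lang = "am"
inputs = tokenizer(amharic, return_tensors="pt")
generated = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("en"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```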
arXiv Detail & Related papers (2022-10-27T07:18:53Z)
- RuArg-2022: Argument Mining Evaluation [69.87149207721035]
This paper is the organizers' report on the first competition of argumentation analysis systems dealing with Russian-language texts.
A corpus containing 9,550 sentences (comments on social media posts) on three topics related to the COVID-19 pandemic was prepared.
The system that won the first place in both tasks used the NLI (Natural Language Inference) variant of the BERT architecture.
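A minimal sketch of that NLI-style formulation: pair a comment with a hypothesis stating a position on the topic and score entailment with a cross-encoder. The multilingual NLI checkpoint below is an assumption, not the winning system's model.
```python
# Minimal sketch (assumed checkpoint, not the winning system): cast stance
# detection as NLI by pairing the comment with a topic hypothesis.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "joeddav/xlm-roberta-large-xnli"  # assumed multilingual NLI model
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

premise = "Маски ничего не дают и только мешают дышать."  # social-media comment
hypothesis = "Ношение масок полезно."                      # hypothesis for the topic "masks"
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
probs = torch.softmax(model(**inputs).logits, dim=-1)
print(model.config.id2label, probs)  # check the label order for this checkpoint
```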
arXiv Detail & Related papers (2022-06-18T17:13:37Z)
- Pre-trained Transformer-Based Approach for Arabic Question Answering: A Comparative Study [0.5801044612920815]
We evaluate state-of-the-art pre-trained transformer models for Arabic QA using four reading comprehension datasets.
We fine-tuned and compared the performance of the AraBERTv2-base model, AraBERTv0.2-large model, and AraELECTRA model.
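A minimal sketch of the extractive QA setup being compared, here with the AraELECTRA discriminator; the QA head is randomly initialized when loaded this way and would need fine-tuning on an Arabic reading-comprehension dataset before its answers mean anything.
```python
# Minimal sketch of extractive Arabic QA with AraELECTRA; the QA head is
# untrained here, so this only illustrates the evaluation setup.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

name = "aubmindlab/araelectra-base-discriminator"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForQuestionAnswering.from_pretrained(name)  # warns that the QA head is new

question = "أين تقع جامعة قطر؟"
context = "جامعة قطر هي جامعة حكومية تقع في مدينة الدوحة عاصمة قطر."
inputs = tokenizer(question, context, return_tensors="pt", truncation=True)
with torch.no_grad():
    out = model(**inputs)
start = int(out.start_logits.argmax())
end = int(out.end_logits.argmax())
answer = tokenizer.decode(inputs["input_ids"][0][start:end + 1])
```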
arXiv Detail & Related papers (2021-11-10T12:33:18Z)
- AraGPT2: Pre-Trained Transformer for Arabic Language Generation [0.0]
We develop the first advanced Arabic language generation model, AraGPT2, trained from scratch on a large Arabic corpus of internet text and news articles.
Our largest model, AraGPT2-mega, has 1.46 billion parameters, which makes it the largest Arabic language model available.
For text generation, our best model achieves a perplexity of 29.8 on held-out Wikipedia articles.
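A minimal sketch of the held-out perplexity evaluation, using the publicly available aubmindlab/aragpt2-base checkpoint as a stand-in for AraGPT2-mega and a single example sentence in place of held-out Wikipedia articles.
```python
# Minimal sketch: perplexity of an Arabic sentence under AraGPT2 (base variant
# used here as a stand-in for AraGPT2-mega).
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

name = "aubmindlab/aragpt2-base"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
model.eval()

text = "تعد اللغة العربية من أكثر اللغات انتشارا في العالم."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    loss = model(**inputs, labels=inputs["input_ids"]).loss  # mean token negative log-likelihood
print("perplexity:", math.exp(loss.item()))
```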
arXiv Detail & Related papers (2020-12-31T09:48:05Z)
- Evaluation Toolkit For Robustness Testing Of Automatic Essay Scoring Systems [64.4896118325552]
We evaluate the current state-of-the-art AES models using a model adversarial evaluation scheme and associated metrics.
We find that AES models are highly overstable. Even heavy modifications (as much as 25%) with content unrelated to the topic of the questions do not decrease the score produced by the models.
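A minimal sketch of that kind of overstability probe: grow an essay by roughly 25% with off-topic sentences and compare scores before and after. `score_essay` is a hypothetical placeholder for any trained AES model, not part of the toolkit.
```python
# Minimal sketch of an overstability probe; score_essay is a hypothetical
# stand-in for a trained AES model's scoring function.
import random


def score_essay(essay: str) -> float:
    """Placeholder for a trained AES model (hypothetical)."""
    raise NotImplementedError


def add_off_topic_content(essay: str, fraction: float = 0.25, seed: int = 0) -> str:
    """Append unrelated sentences until the essay grows by roughly `fraction`."""
    rng = random.Random(seed)
    fillers = [
        "The recipe calls for two cups of flour and a pinch of salt.",
        "Saturn has a prominent ring system made mostly of ice.",
        "The train departs from platform nine every morning.",
    ]
    words = essay.split()
    target = int(len(words) * fraction)
    added = []
    while len(added) < target:
        added.extend(rng.choice(fillers).split())
    return essay + " " + " ".join(added[:target])


# original = open("essay.txt").read()
# print(score_essay(original), score_essay(add_off_topic_content(original)))
```
An overstable model would return nearly identical scores for both versions despite a quarter of the text being off-topic.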
arXiv Detail & Related papers (2020-07-14T03:49:43Z)