M-DAIGT: A Shared Task on Multi-Domain Detection of AI-Generated Text
- URL: http://arxiv.org/abs/2511.11340v1
- Date: Fri, 14 Nov 2025 14:26:31 GMT
- Title: M-DAIGT: A Shared Task on Multi-Domain Detection of AI-Generated Text
- Authors: Salima Lamsiyah, Saad Ezzini, Abdelkader El Mahdaouy, Hamza Alami, Abdessamad Benlahbib, Samir El Amrany, Salmane Chafik, Hicham Hammouchi,
- Abstract summary: We introduce the Multi-Domain Detection of AI-Generated Text (M-DAIGT) shared task. M-DAIGT comprises two binary classification subtasks: News Article Detection (NAD) and Academic Writing Detection (AWD). A total of 46 unique teams registered for the shared task, of which four teams submitted final results.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The generation of highly fluent text by Large Language Models (LLMs) poses a significant challenge to information integrity and academic research. In this paper, we introduce the Multi-Domain Detection of AI-Generated Text (M-DAIGT) shared task, which focuses on detecting AI-generated text across multiple domains, particularly in news articles and academic writing. M-DAIGT comprises two binary classification subtasks: News Article Detection (NAD) (Subtask 1) and Academic Writing Detection (AWD) (Subtask 2). To support this task, we developed and released a new large-scale benchmark dataset of 30,000 samples, balanced between human-written and AI-generated texts. The AI-generated content was produced using a variety of modern LLMs (e.g., GPT-4, Claude) and diverse prompting strategies. A total of 46 unique teams registered for the shared task, of which four teams submitted final results. All four teams participated in both Subtask 1 and Subtask 2. We describe the methods employed by these participating teams and briefly discuss future directions for M-DAIGT.
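As context for the two binary subtasks, a human-vs-AI text detector can be sketched as a simple bag-of-words classifier. The sketch below is illustrative only: it uses a tiny invented training set and a from-scratch Naive Bayes model, not the released M-DAIGT benchmark or any participant's system.

```python
# Minimal bag-of-words Naive Bayes baseline for binary "human vs. AI" text
# detection, in the spirit of the M-DAIGT subtasks. The toy training texts
# below are invented for illustration; the real benchmark has 30,000 samples.
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

class NaiveBayesDetector:
    def __init__(self, alpha=1.0):
        self.alpha = alpha           # Laplace smoothing constant
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            for tok in tokenize(text):
                score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

texts = [
    "breaking news the local council voted on the new budget today",
    "reporters confirmed the storm damaged several homes overnight",
    "as an ai language model i can certainly help you draft this article",
    "in conclusion this essay has comprehensively explored the topic",
]
labels = ["human", "human", "ai", "ai"]

detector = NaiveBayesDetector()
detector.fit(texts, labels)
print(detector.predict("the council voted on the budget"))  # prints "human"
```

Real systems in the task used far stronger models (fine-tuned Transformers, ensembles), but the decision problem they solve is exactly this binary labeling.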
Related papers
- AI-generated Text Detection: A Multifaceted Approach to Binary and Multiclass Classification
Large Language Models (LLMs) have demonstrated remarkable capabilities in generating text that closely resembles human writing. Such capabilities are prone to misuse, such as fake news generation, spam email creation, and academic dishonesty. We propose two neural architectures: an optimized model and a simpler variant. For Task A, the optimized architecture achieved fifth place with an $F1$ score of 0.994; for Task B, the simpler architecture also ranked fifth, with an $F1$ score of 0.627.
arXiv Detail & Related papers (2025-05-15T09:28:06Z)
- Sarang at DEFACTIFY 4.0: Detecting AI-Generated Text Using Noised Data and an Ensemble of DeBERTa Models
This paper presents an effective approach to detect AI-generated text. It was developed for the Defactify 4.0 shared task at the Fourth Workshop on Multimodal Fact Checking and Hate Speech Detection. Our team (Sarang) achieved 1st place in both tasks, with F1 scores of 1.0 and 0.9531, respectively.
arXiv Detail & Related papers (2025-02-24T05:32:00Z)
- GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human
We present a shared task on binary machine-generated text detection, conducted as part of the GenAI workshop at COLING 2025. The task consists of two subtasks: Monolingual (English) and Multilingual. We provide a comprehensive overview of the data, a summary of the results, detailed descriptions of the participating systems, and an in-depth analysis of submissions.
arXiv Detail & Related papers (2025-01-19T11:11:55Z)
- GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge
We aim to answer whether models can detect generated text from a large, yet fixed, number of domains and LLMs. Over the course of three months, our task was attempted by 9 teams with 23 detector submissions. We find that multiple participants were able to obtain accuracies of over 99% on machine-generated text from RAID while maintaining a 5% false positive rate.
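The 5% false positive rate quoted above is an operating-point constraint: a threshold on the detector's score is chosen so that at most 5% of human-written texts are flagged. A minimal sketch, with synthetic scores standing in for a real detector's output (none of these numbers come from RAID):

```python
# Choosing a detection threshold that caps the false positive rate (FPR).
# The score lists are synthetic stand-ins; a real detector would produce a
# "machine-generated" score per text. Assumes distinct scores for simplicity.

def threshold_for_fpr(human_scores, target_fpr):
    """Return a threshold such that at most target_fpr of human texts
    score at or above it (i.e., are falsely flagged)."""
    ordered = sorted(human_scores, reverse=True)
    allowed = int(target_fpr * len(ordered))  # max false positives allowed
    if allowed >= len(ordered):
        return min(ordered)
    return ordered[allowed] + 1e-9  # just above the first disallowed score

human_scores = [0.02, 0.05, 0.10, 0.12, 0.15, 0.18, 0.20, 0.25, 0.30, 0.91]
machine_scores = [0.55, 0.60, 0.72, 0.80, 0.88, 0.90, 0.93, 0.95, 0.97, 0.99]

tau = threshold_for_fpr(human_scores, target_fpr=0.05)
fpr = sum(s >= tau for s in human_scores) / len(human_scores)
tpr = sum(s >= tau for s in machine_scores) / len(machine_scores)
print(f"threshold={tau:.3f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Note the trade-off this makes visible: with only ten human samples, a 5% budget allows zero false positives, so the threshold must clear even the highest-scoring human text, and recall on machine text drops accordingly.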
arXiv Detail & Related papers (2025-01-15T16:21:09Z)
- Advacheck at GenAI Detection Task 1: AI Detection Powered by Domain-Aware Multi-Tasking
The paper describes a system designed by the Advacheck team to recognise machine-generated and human-written texts in the monolingual subtask of the GenAI Detection Task 1 competition.
Our system is a multi-task architecture with a shared Transformer encoder feeding several classification heads.
arXiv Detail & Related papers (2024-11-18T17:03:30Z)
- GigaCheck: Detecting LLM-generated Content
In this work, we investigate the task of generated text detection by proposing GigaCheck.
Our research explores two approaches: (i) distinguishing human-written texts from LLM-generated ones, and (ii) detecting LLM-generated intervals in Human-Machine collaborative texts.
Specifically, we use a fine-tuned general-purpose LLM in conjunction with a DETR-like detection model, adapted from computer vision, to localize AI-generated intervals within text.
arXiv Detail & Related papers (2024-10-31T08:30:55Z)
- SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine.
Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM.
Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
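Subtask C above is essentially a change-point problem. As a toy illustration (not the SemEval baseline or any submitted system), suppose a detector assigns each token a machine-likeness score; the boundary can then be chosen to best separate a low-scoring human prefix from a high-scoring machine suffix:

```python
# Toy view of human-to-machine boundary detection: given per-token
# "machine-likeness" scores from some detector (synthetic here), pick the
# boundary index that best splits the sequence at a 0.5 cutoff.

def best_boundary(scores):
    """Index i minimizing: (# prefix tokens scored machine-like) +
    (# suffix tokens scored human-like), with 0.5 as the toy cutoff."""
    n = len(scores)
    best_i, best_cost = 0, float("inf")
    for i in range(n + 1):
        cost = sum(s >= 0.5 for s in scores[:i]) + sum(s < 0.5 for s in scores[i:])
        if cost < best_cost:
            best_i, best_cost = i, cost
    return best_i

scores = [0.1, 0.2, 0.1, 0.3, 0.8, 0.9, 0.7, 0.95]  # human prefix, machine suffix
print(best_boundary(scores))  # prints 4: boundary after the four human-like tokens
```

Real change-point systems score tokens with a language model rather than receiving clean scores, but the search over candidate boundaries has the same shape.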
arXiv Detail & Related papers (2024-04-22T13:56:07Z)
- Multitask Multimodal Prompted Training for Interactive Embodied Task Completion
Embodied MultiModal Agent (EMMA) is a unified encoder-decoder model that reasons over images and trajectories.
By unifying all tasks as text generation, EMMA learns a language of actions which facilitates transfer across tasks.
arXiv Detail & Related papers (2023-11-07T15:27:52Z)
- Findings of the RuATD Shared Task 2022 on Artificial Text Detection in Russian
We present the shared task on artificial text detection in Russian, which was organized as part of the Dialogue Evaluation initiative, held in 2022.
The dataset includes texts from 14 sources: one human writer and 13 text-generation models fine-tuned for one or more generation tasks.
The human-written texts are collected from publicly available resources across multiple domains.
arXiv Detail & Related papers (2022-06-03T14:12:33Z)
- FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue
This work explores conversational task transfer by introducing FETA: a benchmark for few-sample task transfer in open-domain dialogue.
FETA contains two underlying sets of conversations, annotated with 10 and 7 tasks respectively, enabling the study of intra-dataset task transfer.
We utilize three popular language models and three learning algorithms to analyze the transferability between 132 source-target task pairs.
arXiv Detail & Related papers (2022-05-12T17:59:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.