GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human
- URL: http://arxiv.org/abs/2501.11012v1
- Date: Sun, 19 Jan 2025 11:11:55 GMT
- Title: GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human
- Authors: Yuxia Wang, Artem Shelmanov, Jonibek Mansurov, Akim Tsvigun, Vladislav Mikhailov, Rui Xing, Zhuohan Xie, Jiahui Geng, Giovanni Puccetti, Ekaterina Artemova, jinyan su, Minh Ngoc Ta, Mervat Abassy, Kareem Ashraf Elozeiri, Saad El Dine Ahmed El Etter, Maiya Goloburda, Tarek Mahmoud, Raj Vardhan Tomar, Nurkhan Laiyk, Osama Mohammed Afzal, Ryuto Koike, Masahiro Kaneko, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov
- Abstract summary: We present a shared task on binary machine-generated text detection, conducted as part of the GenAI workshop at COLING 2025.
The task consists of two subtasks: Monolingual (English) and Multilingual.
We provide a comprehensive overview of the data, a summary of the results, detailed descriptions of the participating systems, and an in-depth analysis of submissions.
- Score: 71.42669028683741
- Abstract: We present the GenAI Content Detection Task 1, a shared task on binary machine-generated text detection, conducted as part of the GenAI workshop at COLING 2025. The task consists of two subtasks: Monolingual (English) and Multilingual. The shared task attracted many participants: 36 teams made official submissions to the Monolingual subtask during the test phase and 26 teams to the Multilingual subtask. We provide a comprehensive overview of the data, a summary of the results (including system rankings and performance scores), detailed descriptions of the participating systems, and an in-depth analysis of submissions. https://github.com/mbzuai-nlp/COLING-2025-Workshop-on-MGT-Detection-Task1
Related papers
- GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge [71.69373986176839]
We aim to answer whether models can detect generated text from a large, yet fixed, number of domains and LLMs.
Over the course of three months, our task was attempted by 9 teams with 23 detector submissions.
We find that multiple participants were able to obtain accuracies of over 99% on machine-generated text from RAID while maintaining a 5% False Positive Rate.
arXiv Detail & Related papers (2025-01-15T16:21:09Z)
- GenAI Content Detection Task 2: AI vs. Human -- Academic Essay Authenticity Challenge [12.076440946525434]
The Academic Essay Authenticity Challenge was organized as part of the GenAI Content Detection shared tasks collocated with COLING 2025.
This challenge focuses on detecting machine-generated vs. human-authored essays for academic purposes.
The challenge involves two languages: English and Arabic.
This paper outlines the task formulation, details the dataset construction process, and explains the evaluation framework.
arXiv Detail & Related papers (2024-12-24T08:33:44Z)
- SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine.
Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM.
Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z)
- Wav2Gloss: Generating Interlinear Glossed Text from Speech [78.64412090339044]
We propose Wav2Gloss, a task in which four linguistic annotation components are extracted automatically from speech.
We provide various baselines to lay the groundwork for future research on Interlinear Glossed Text generation from speech.
arXiv Detail & Related papers (2024-03-19T21:45:29Z)
- ArAIEval Shared Task: Persuasion Techniques and Disinformation Detection in Arabic Text [41.3267575540348]
We present an overview of the ArAIEval shared task, organized as part of the first ArabicNLP 2023 conference, co-located with EMNLP 2023.
ArAIEval offers two tasks over Arabic text: (i) persuasion technique detection, focusing on identifying persuasion techniques in tweets and news articles, and (ii) disinformation detection in binary and multiclass setups over tweets.
A total of 20 teams participated in the final evaluation phase, with 14 and 16 teams participating in Tasks 1 and 2, respectively.
arXiv Detail & Related papers (2023-11-06T15:21:19Z)
- Findings of the The RuATD Shared Task 2022 on Artificial Text Detection in Russian [6.9244605050142995]
We present the shared task on artificial text detection in Russian, organized as part of the Dialogue Evaluation initiative held in 2022.
The dataset includes texts from 14 text generators, i.e., one human writer and 13 text generative models fine-tuned for one or more of the following generation tasks.
The human-written texts are collected from publicly available resources across multiple domains.
arXiv Detail & Related papers (2022-06-03T14:12:33Z)
- Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with an extra code-switching restore task to bridge the gap between the pretrain and finetune stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
arXiv Detail & Related papers (2022-04-16T16:08:38Z)
- Handshakes AI Research at CASE 2021 Task 1: Exploring different approaches for multilingual tasks [0.22940141855172036]
The aim of the CASE 2021 Shared Task 1 was to detect and classify socio-political and crisis event information in a multilingual setting.
Our submission contained entries in all of the subtasks, and the scores obtained validated our research finding.
arXiv Detail & Related papers (2021-10-29T07:58:49Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection [55.445023584632175]
We build an offensive language detection system that combines multi-task learning with BERT-based models.
Our model achieves a 91.51% F1 score in English Sub-task A, which is comparable to the first-place system.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.