Related papers: Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text

URL: http://arxiv.org/abs/2407.11774v1
Date: Tue, 16 Jul 2024 14:33:01 GMT
Title: Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text
Authors: Seyedeh Fatemeh Ebrahimi, Karim Akhavan Azari, Amirmasoud Iravani, Arian Qazvini, Pouya Sadeghi, Zeinab Sadat Taghavi, Hossein Sameti
Abstract summary: Detecting Machine-Generated Text (MGT) has emerged as a significant area of study within Natural Language Processing. In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection. Our proposed system achieves an accuracy of 78.9% on the test dataset, positioning us at 57th among participants.
Score: 2.2039952888743253
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Detecting Machine-Generated Text (MGT) has emerged as a significant area of study within Natural Language Processing. While language models generate text, they often leave discernible traces, which can be scrutinized using either traditional feature-based methods or more advanced neural language models. In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection as a binary classification task. Focusing specifically on Subtask A (Monolingual-English) within the SemEval-2024 competition framework, our proposed system achieves an accuracy of 78.9% on the test dataset, positioning us at 57th among participants. Our study addresses this challenge while considering the limited hardware resources, resulting in a system that excels at identifying human-written texts but encounters challenges in accurately discerning MGTs.
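The abstract reports a single accuracy figure but also notes that the system handles human-written text better than machine-generated text. The following is a minimal sketch of how such an asymmetry can be made visible by computing recall per class alongside overall accuracy. The function name `per_class_recall`, the label encoding (0 = human, 1 = machine), and the toy labels are all assumptions chosen for illustration; they are not the paper's data, code, or results.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return (accuracy, {label: recall}) for two parallel label lists.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")

    correct = Counter()  # correctly predicted examples per true label
    total = Counter()    # all examples per true label
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    accuracy = sum(correct.values()) / len(y_true)
    recall = {label: correct[label] / total[label] for label in total}
    return accuracy, recall


# Toy labels (assumed encoding: 0 = human-written, 1 = machine-generated).
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

accuracy, recall = per_class_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}")
for label, value in sorted(recall.items()):
    print(f"recall[{label}]={value:.2f}")
```

In this toy case every human-written example is recovered while two of the five machine-generated examples are missed, so the overall accuracy alone would hide where the errors fall.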

Related papers

Human Texts Are Outliers: Detecting LLM-generated Texts via Out-of-distribution Detection [71.59834293521074]
We develop a framework to distinguish between human-authored and machine-generated text. Our method achieves 98.3% AUROC and AUPR with only 8.9% FPR95 on DeepFake dataset. Code, pretrained weights, and demo will be released.
arXiv Detail & Related papers (2025-10-07T08:14:45Z)
SLRTP2025 Sign Language Production Challenge: Methodology, Results, and Future Work [87.9341538630949]
The first Sign Language Production Challenge was held as part of the third SLRTP Workshop at CVPR 2025. The competition's aims are to evaluate architectures that translate from spoken language sentences to a sequence of skeleton poses. This paper presents the challenge design and the winning methodologies.
arXiv Detail & Related papers (2025-08-09T11:57:33Z)
Evaluating Text Style Transfer: A Nine-Language Benchmark for Text Detoxification [66.69370876902222]
We perform the first comprehensive multilingual study on the evaluation of text detoxification systems across nine languages. We assess the effectiveness of modern neural-based evaluation models alongside prompting-based LLM-as-a-judge approaches. Our findings provide a practical recipe for designing more reliable multilingual TST evaluation pipelines.
arXiv Detail & Related papers (2025-07-21T12:38:07Z)
Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts. We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z)
Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text [7.959800630494841]
SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection. We propose a RoBERTa-BiLSTM based classifier designed to classify text into two categories: AI-generated or human. Our architecture ranked 46th among 125 on the official leaderboard with an accuracy of 80.83.
arXiv Detail & Related papers (2024-07-03T10:22:23Z)
SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine. Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM. Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z)
PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text? [4.463184061618504]
We present our submission to the SemEval-2024 Task 8 "Multigenerator, Multidomain, and Black-Box Machine-Generated Text Detection". Our approach relies on combining embeddings from the RoBERTa-base with diversity features and uses a resampled training set. Our results show that our approach is generalizable across unseen models and domains, achieving an accuracy of 0.91.
arXiv Detail & Related papers (2024-04-08T13:05:02Z)
TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques [2.149586323955579]
Large Language Models (LLMs) generate fluent content across a wide spectrum of user queries. This capability has raised concerns regarding misinformation and personal information leakage. We present our methods for the SemEval2024 Task8, aiming to detect machine-generated text across various domains.
arXiv Detail & Related papers (2024-03-25T10:09:03Z)
Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents. Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z)
KInIT at SemEval-2024 Task 8: Fine-tuned LLMs for Multilingual Machine-Generated Text Detection [0.0]
SemEval-2024 Task 8 is focused on multigenerator, multidomain, and multilingual black-box machine-generated text detection. Our submitted method achieved competitive results, ranking at the fourth place, just under 1 percentage point behind the winner.
arXiv Detail & Related papers (2024-02-21T10:09:56Z)
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection [69.41274756177336]
Large Language Models (LLMs) have brought an unprecedented surge in machine-generated text (MGT) across diverse channels. This raises legitimate concerns about its potential misuse and societal implications. We introduce a new benchmark based on a multilingual, multi-domain, and multi-generator corpus of MGTs -- M4GT-Bench.
arXiv Detail & Related papers (2024-02-17T02:50:33Z)
MGTBench: Benchmarking Machine-Generated Text Detection [54.81446366272403]
This paper proposes the first benchmark framework for MGT detection against powerful large language models (LLMs). We show that a larger number of words in general leads to better performance and that most detection methods can achieve similar performance with far fewer training samples. Our findings indicate that the model-based detection methods still perform well in the text attribution task.
arXiv Detail & Related papers (2023-03-26T21:12:36Z)
DIALOG-22 RuATD Generated Text Detection [0.0]
Detectors that can distinguish between TGM-generated and human-written text play an important role in preventing abuse of TGM. We describe our pipeline for the two DIALOG-22 RuATD tasks: detecting generated text (binary task) and classifying which model was used to generate the text.
arXiv Detail & Related papers (2022-06-16T09:33:26Z)
Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models. Our model achieves 91.51% F1 score in English Sub-task A, which is comparable to the first place.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.