Advancing Machine-Generated Text Detection from an Easy to Hard Supervision Perspective
- URL: http://arxiv.org/abs/2511.00988v1
- Date: Sun, 02 Nov 2025 15:59:31 GMT
- Title: Advancing Machine-Generated Text Detection from an Easy to Hard Supervision Perspective
- Authors: Chenwang Wu, Yiu-ming Cheung, Bo Han, Defu Lian
- Abstract summary: Existing machine-generated text (MGT) detection methods implicitly assume labels as the "golden standard". We propose an easy-to-hard enhancement framework to provide reliable supervision under such inexact conditions.
- Score: 108.30620357325559
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing machine-generated text (MGT) detection methods implicitly assume labels are the "golden standard". However, we reveal boundary ambiguity in MGT detection, implying that traditional training paradigms are inexact. Moreover, the limitations of human cognition and the superintelligence of detectors make inexact learning widespread and inevitable. To this end, we propose an easy-to-hard enhancement framework that provides reliable supervision under such inexact conditions. Distinct from knowledge distillation, our framework employs an easy supervisor targeting the relatively simple longer-text detection task (despite its weaker capabilities) to enhance the more challenging target detector. First, the longer texts targeted by the supervisor theoretically alleviate the impact of inexact labels, laying the foundation for reliable supervision. Second, by structurally incorporating the detector into the supervisor, we theoretically establish the supervisor as a lower performance bound for the detector. Thus, optimizing the supervisor indirectly optimizes the detector, ultimately approximating the underlying "golden" labels. Extensive experiments across diverse practical scenarios, including cross-LLM, cross-domain, mixed-text, and paraphrase-attack settings, demonstrate the framework's significant detection effectiveness. The code is available at: https://github.com/tmlr-group/Easy2Hard.
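The abstract's easy-to-hard mechanism can be illustrated with a toy sketch. This is not the authors' implementation: the scalar logistic detector, the segment features, and the logit-averaging supervisor below are all simplified assumptions. The point it shows is structural: the "supervisor" scores a longer text by pooling the embedded detector's segment-level logits, so training only the supervisor on the easier long-text task still updates the shared detector weights.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class Detector:
    """Toy scalar detector d(x) = sigmoid(w * f(x) + b)."""
    def __init__(self):
        self.w, self.b = 0.0, 0.0
    def logit(self, feat):
        return self.w * feat + self.b
    def prob(self, feat):
        return sigmoid(self.logit(feat))

def supervisor_prob(det, segment_feats):
    # Supervisor = detector applied to the longer text, modeled here as
    # the mean segment logit; the detector is structurally embedded.
    mean_logit = sum(det.logit(f) for f in segment_feats) / len(segment_feats)
    return sigmoid(mean_logit)

def train_supervisor(det, docs, labels, lr=0.5, epochs=200):
    # SGD on the supervisor's log-loss; gradients flow into the
    # shared detector parameters (w, b).
    for _ in range(epochs):
        for feats, y in zip(docs, labels):
            g = supervisor_prob(det, feats) - y   # dLoss/d(mean_logit)
            mean_f = sum(feats) / len(feats)
            det.w -= lr * g * mean_f
            det.b -= lr * g

random.seed(0)
# Synthetic segment features: human ~ N(0,1), machine ~ N(1,1);
# each "long text" is 8 segments, so its mean feature is less noisy.
human = [[random.gauss(0, 1) for _ in range(8)] for _ in range(30)]
machine = [[random.gauss(1, 1) for _ in range(8)] for _ in range(30)]
det = Detector()
train_supervisor(det, human + machine, [0] * 30 + [1] * 30)

# The detector, trained only through the supervisor, now classifies
# individual short segments better than chance.
acc = (sum(det.prob(f[0]) < 0.5 for f in human)
       + sum(det.prob(f[0]) > 0.5 for f in machine)) / 60
print(acc)
```

Averaging features over segments shrinks the label noise the supervisor sees, which is the intuition behind "longer texts alleviate the impact of inexact labels" in the abstract.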
Related papers
- Beyond Raw Detection Scores: Markov-Informed Calibration for Boosting Machine-Generated Text Detection [105.14032334647932]
Machine-generated texts (MGTs) pose risks such as disinformation and phishing, highlighting the need for reliable detection. Metric-based methods, which extract statistically distinguishable features of MGTs, are often more practical than complex model-based methods that are prone to overfitting. We propose a Markov-informed score calibration strategy that models two relationships among contextual detection scores that may aid calibration.
arXiv Detail & Related papers (2026-02-08T16:06:12Z) - Can We Trust LLM Detectors? [7.046352335920807]
Training-free and supervised AI text detectors are brittle under distribution shift, unseen generators, and simple stylistic perturbations. We propose a supervised contrastive learning framework that learns discriminative style embeddings. Experiments show that while supervised detectors excel in-domain, they degrade sharply out-of-domain, and training-free methods remain highly sensitive to proxy choice.
arXiv Detail & Related papers (2026-01-09T04:53:06Z) - Explainable and Fine-Grained Safeguarding of LLM Multi-Agent Systems via Bi-Level Graph Anomaly Detection [76.91230292971115]
Large language model (LLM)-based multi-agent systems (MAS) have shown strong capabilities in solving complex tasks. XG-Guard is an explainable and fine-grained safeguarding framework for detecting malicious agents in MAS.
arXiv Detail & Related papers (2025-12-21T13:46:36Z) - Diversity Boosts AI-Generated Text Detection [51.56484100374058]
DivEye is a novel framework that captures how unpredictability fluctuates across a text using surprisal-based features. Our method outperforms existing zero-shot detectors by up to 33.2% and achieves competitive performance with fine-tuned baselines.
arXiv Detail & Related papers (2025-09-23T10:21:22Z) - DetectAnyLLM: Towards Generalizable and Robust Detection of Machine-Generated Text Across Domains and Models [60.713908578319256]
We propose Direct Discrepancy Learning (DDL) to optimize the detector with task-oriented knowledge. Built upon this, we introduce DetectAnyLLM, a unified detection framework that achieves state-of-the-art MGTD performance. MIRAGE samples human-written texts from 10 corpora across 5 text domains, which are then re-generated or revised using 17 cutting-edge LLMs.
arXiv Detail & Related papers (2025-09-15T10:59:57Z) - Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors [4.7713095161046555]
We present a pipeline to test the resilience of state-of-the-art MGT detectors to linguistically informed adversarial attacks. We fine-tune language models to shift the MGT style toward human-written text (HWT). This exploits the detectors' reliance on stylistic clues, making new generations more challenging to detect.
arXiv Detail & Related papers (2025-05-30T12:33:30Z) - Detection Latencies of Anomaly Detectors: An Overlooked Perspective ? [1.8492669447784602]
In this paper, we argue the relevance of measuring the temporal latency of attacks and errors.
We propose an evaluation approach for detectors to ensure a pragmatic trade-off between correct and in-time detection.
arXiv Detail & Related papers (2024-02-14T10:52:39Z) - Hard-normal Example-aware Template Mutual Matching for Industrial Anomaly Detection [78.734927709231]
Anomaly detectors are widely used in industrial manufacturing to detect and localize unknown defects in query images. These detectors are trained on anomaly-free samples and have successfully distinguished anomalies from most normal samples. However, hard-normal examples are scattered and far apart from most normal samples, and thus they are often mistaken for anomalies by existing methods.
arXiv Detail & Related papers (2023-03-28T17:54:56Z) - MGTBench: Benchmarking Machine-Generated Text Detection [54.81446366272403]
This paper proposes the first benchmark framework for MGT detection against powerful large language models (LLMs).
We show that a larger number of words in general leads to better performance and most detection methods can achieve similar performance with much fewer training samples.
Our findings indicate that the model-based detection methods still perform well in the text attribution task.
arXiv Detail & Related papers (2023-03-26T21:12:36Z) - LEDetection: A Simple Framework for Semi-Supervised Few-Shot Object Detection [4.3512163406552]
This paper studies the new task of semi-supervised FSOD by considering a realistic scenario in which both base and novel labels are simultaneously scarce.
We introduce SoftER Teacher, a robust detector combining pseudo-labeling with consistency learning on region proposals.
Rigorous experiments show that SoftER Teacher surpasses the novel performance of a strong supervised detector using only 10% of required base labels.
arXiv Detail & Related papers (2023-03-10T06:49:31Z) - DetectorGuard: Provably Securing Object Detectors against Localized Patch Hiding Attacks [28.94435153159868]
State-of-the-art object detectors are vulnerable to localized patch hiding attacks.
We propose DetectorGuard, the first general framework for building detectors that are provably robust against localized patch hiding attacks.
arXiv Detail & Related papers (2021-02-05T02:02:21Z)
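Several entries above (notably DivEye) rely on surprisal-based signals. The fluctuation-of-unpredictability idea can be sketched with a toy unigram language model standing in for the real LLM a method like DivEye would use; the corpus, example sentences, and feature names below are illustrative assumptions, not DivEye's actual features.

```python
import math
from collections import Counter

def unigram_surprisals(tokens, counts, total):
    # Add-one smoothed unigram surprisal -log p(t) per token;
    # a real detector would use token log-probs from an LLM instead.
    vocab = len(counts)
    return [-math.log((counts[t] + 1) / (total + vocab)) for t in tokens]

def fluctuation_features(surprisals):
    # Summarize how unpredictability varies across the text.
    n = len(surprisals)
    mean = sum(surprisals) / n
    var = sum((s - mean) ** 2 for s in surprisals) / n
    diffs = [abs(surprisals[i + 1] - surprisals[i]) for i in range(n - 1)]
    return {"mean": mean, "variance": var,
            "mean_abs_diff": sum(diffs) / len(diffs)}

# Tiny reference corpus for the unigram model.
corpus = ("the cat sat on the mat the dog sat on the log "
          "a cat and a dog sat together").split()
counts, total = Counter(corpus), len(corpus)

bursty = "the cat sat on the zeppelin".split()  # one surprising token
flat = "the cat sat on the mat".split()         # uniformly predictable

f_bursty = fluctuation_features(unigram_surprisals(bursty, counts, total))
f_flat = fluctuation_features(unigram_surprisals(flat, counts, total))
print(f_bursty["variance"] > f_flat["variance"])  # True: burstier text
```

Texts with spiky, uneven surprisal sequences tend to look more human-written under such signals, which is why fluctuation statistics (rather than mean surprisal alone) can serve as zero-shot detection features.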
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the listed information and is not responsible for any consequences arising from its use.