Learning to Rewrite: Generalized LLM-Generated Text Detection
- URL: http://arxiv.org/abs/2408.04237v1
- Date: Thu, 8 Aug 2024 05:53:39 GMT
- Title: Learning to Rewrite: Generalized LLM-Generated Text Detection
- Authors: Wei Hao, Ran Li, Weiliang Zhao, Junfeng Yang, Chengzhi Mao
- Abstract summary: Large language models (LLMs) can be abused at scale to create non-factual content and spread disinformation.
We propose training an LLM to rewrite input text, producing minimal edits for LLM-generated content and more edits for human-written text.
Our work suggests that LLMs can effectively detect machine-generated text if they are trained properly.
- Score: 19.9477991969521
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) can be abused at scale to create non-factual content and spread disinformation. Detecting LLM-generated content is essential to mitigate these risks, but current classifiers often fail to generalize in open-world contexts. Prior work shows that LLMs tend to rewrite LLM-generated content less frequently, which can be used for detection and naturally generalizes to unforeseen data. However, we find that the rewriting edit distance between human and LLM content can be indistinguishable across domains, leading to detection failures. We propose training an LLM to rewrite input text, producing minimal edits for LLM-generated content and more edits for human-written text, deriving a distinguishable and generalizable edit distance difference across different domains. Experiments on text from 21 independent domains and three popular LLMs (e.g., GPT-4o, Gemini, and Llama-3) show that our classifier outperforms the state-of-the-art zero-shot classifier by up to 20.6% on AUROC score and the rewriting classifier by 9.2% on F1 score. Our work suggests that LLMs can effectively detect machine-generated text if they are trained properly.
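A minimal sketch of the rewrite-based detection idea from the abstract, assuming access to the trained rewriting model through a hypothetical rewrite() helper; the similarity-based edit measure and the threshold value are illustrative choices, not the paper's.

```python
# Sketch of rewrite-based detection: a rewriting LLM trained as described
# should return LLM-generated input nearly verbatim and edit human-written
# input more heavily. rewrite() and the threshold are placeholders.
from difflib import SequenceMatcher


def rewrite(text: str) -> str:
    """Placeholder for a call to the trained rewriting LLM (hypothetical)."""
    raise NotImplementedError("plug in the rewriting model here")


def normalized_edit_difference(original: str, rewritten: str) -> float:
    """1.0 = completely rewritten, 0.0 = returned verbatim."""
    return 1.0 - SequenceMatcher(None, original, rewritten).ratio()


def classify(text: str, threshold: float = 0.15) -> str:
    score = normalized_edit_difference(text, rewrite(text))
    # Few edits -> likely LLM-generated; many edits -> likely human-written.
    return "llm-generated" if score < threshold else "human-written"
```

In practice the threshold would be chosen on validation data; the key property claimed by the paper is that the edit-distance gap stays distinguishable across domains once the rewriter is trained.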
Related papers
- "I know myself better, but not really greatly": Using LLMs to Detect and Explain LLM-Generated Texts [10.454446545249096]
Large language models (LLMs) have demonstrated impressive capabilities in generating human-like texts.
This paper explores the detection and explanation capabilities of LLM-based detectors of LLM-generated texts.
arXiv Detail & Related papers (2025-02-18T11:00:28Z)
- Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection [60.09665704993751]
We introduce FairOPT, an algorithm for group-specific threshold optimization in AI-generated content classifiers.
Our approach partitions data into subgroups based on attributes (e.g., text length and writing style) and learns decision thresholds for each group.
Our framework paves the way for more robust and fair classification criteria in AI-generated output detection.
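A rough sketch of group-specific threshold selection in the spirit of this entry, assuming per-example detector scores and binary labels are already available; the length-based grouping and balanced-accuracy criterion are assumptions, not FairOPT's actual algorithm.

```python
# Illustrative per-group threshold selection (not FairOPT's actual procedure):
# partition examples by a simple attribute (text-length bucket) and pick the
# threshold that maximizes balanced accuracy within each group.
from collections import defaultdict


def length_bucket(text: str) -> str:
    return "short" if len(text.split()) < 100 else "long"


def fit_group_thresholds(texts, scores, labels, candidates=None):
    candidates = candidates or [i / 100 for i in range(1, 100)]
    groups = defaultdict(list)
    for t, s, y in zip(texts, scores, labels):
        groups[length_bucket(t)].append((s, y))

    thresholds = {}
    for group, pairs in groups.items():
        def balanced_acc(th):
            tp = [s >= th for s, y in pairs if y == 1]  # y=1: AI-generated
            tn = [s < th for s, y in pairs if y == 0]   # y=0: human-written
            return (sum(tp) / max(len(tp), 1) + sum(tn) / max(len(tn), 1)) / 2
        thresholds[group] = max(candidates, key=balanced_acc)
    return thresholds
```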
arXiv Detail & Related papers (2025-02-06T21:58:48Z)
- Robust Detection of LLM-Generated Text: A Comparative Analysis [0.276240219662896]
Large language models are being widely integrated into many aspects of life, and their output can quickly fill online resources.
It is therefore increasingly important to develop powerful detectors for generated text.
Such detectors are essential to prevent potential misuse of these technologies and to protect areas such as social media from their negative effects.
arXiv Detail & Related papers (2024-11-09T18:27:15Z)
- Unveiling Large Language Models Generated Texts: A Multi-Level Fine-Grained Detection Framework [9.976099891796784]
Large language models (LLMs) have transformed human writing by enhancing grammar correction, content expansion, and stylistic refinement.
Existing detection methods, which mainly rely on single-feature analysis and binary classification, often fail to effectively identify LLM-generated text in academic contexts.
We propose a novel Multi-level Fine-grained Detection framework that detects LLM-generated text by integrating low-level structural, high-level semantic, and deep-level linguistic features.
arXiv Detail & Related papers (2024-10-18T07:25:00Z)
- Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts.
We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z)
- LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning [4.328134379418151]
Existing AI-generated text detection models are prone to in-domain over-fitting.
We propose LLM-Detector, a novel method for both document-level and sentence-level text detection.
arXiv Detail & Related papers (2024-02-02T05:54:12Z)
- SeqXGPT: Sentence-Level AI-Generated Text Detection [62.3792779440284]
We introduce a sentence-level detection challenge by synthesizing documents polished with large language models (LLMs).
We then propose Sequence X (Check) GPT (SeqXGPT), a novel method that utilizes log probability lists from white-box LLMs as features for sentence-level AIGT detection.
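A small sketch of the white-box log-probability features this entry describes, assuming GPT-2 via Hugging Face transformers as the scoring model; the mean/min/std summary is an illustrative stand-in, not SeqXGPT's exact feature list.

```python
# Sketch: per-token log probabilities from a white-box LM as detection features.
# The model choice (GPT-2) and the simple summary statistics are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")


def token_logprobs(sentence: str) -> torch.Tensor:
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logp = torch.log_softmax(logits[:, :-1], dim=-1)
    # Log probability the model assigns to each actual next token.
    return logp.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1).squeeze(0)


def sentence_features(sentence: str) -> list:
    lp = token_logprobs(sentence)
    return [lp.mean().item(), lp.min().item(), lp.std().item()]
```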
arXiv Detail & Related papers (2023-10-13T07:18:53Z)
- RADAR: Robust AI-Text Detection via Adversarial Learning [69.5883095262619]
RADAR is based on adversarial training of a paraphraser and a detector.
The paraphraser's goal is to generate realistic content to evade AI-text detection.
RADAR uses the feedback from the detector to update the paraphraser, and vice versa.
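A schematic of the paraphraser/detector adversarial loop described above, with hypothetical paraphrase(), update(), and score() interfaces standing in for RADAR's actual training objective.

```python
# Schematic adversarial loop (not RADAR's actual objective): the detector is
# trained to separate human text from paraphrased AI text, while the
# paraphraser is rewarded when its outputs fool the detector. All method
# calls below are hypothetical interfaces on the passed-in objects.
def train_adversarial(detector, paraphraser, human_texts, ai_texts, steps=1000):
    for _ in range(steps):
        evasive = [paraphraser.paraphrase(t) for t in ai_texts]   # evade detection
        detector.update(human=human_texts, ai=evasive)            # detector step
        rewards = [1.0 - detector.score(t) for t in evasive]      # low score = fooled
        paraphraser.update(evasive, rewards)                      # paraphraser step
    return detector, paraphraser
```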
arXiv Detail & Related papers (2023-07-07T21:13:27Z)
- Beyond Black Box AI-Generated Plagiarism Detection: From Sentence to Document Level [4.250876580245865]
Existing AI-generated text classifiers have limited accuracy and often produce false positives.
We propose a novel approach using natural language processing (NLP) techniques.
We generate multiple paraphrased versions of a given question and input them into the large language model to generate answers.
By using a contrastive loss function based on cosine similarity, we match generated sentences with those from the student's response.
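An illustrative sketch of the cosine-similarity matching step, assuming a sentence-transformers embedding model; pairing each generated sentence with its most similar response sentence is a simplification of the paper's contrastive setup.

```python
# Illustrative cosine-similarity matching between LLM-generated sentences and
# sentences from a student's response (not the paper's exact contrastive loss).
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")


def best_matches(generated_sentences, response_sentences):
    gen_emb = embedder.encode(generated_sentences, convert_to_tensor=True)
    resp_emb = embedder.encode(response_sentences, convert_to_tensor=True)
    sims = util.cos_sim(gen_emb, resp_emb)  # shape [n_generated, n_response]
    # For each generated sentence, report the most similar response sentence.
    return [
        (generated_sentences[i],
         response_sentences[int(sims[i].argmax())],
         float(sims[i].max()))
        for i in range(len(generated_sentences))
    ]
```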
arXiv Detail & Related papers (2023-06-13T20:34:55Z)
- MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection.
We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs.
Despite challenges, the top-performing detector can identify 86.54% of out-of-domain texts generated by a new LLM, indicating the feasibility of real-world application.
arXiv Detail & Related papers (2023-05-22T17:13:29Z)
- Can AI-Generated Text be Reliably Detected? [50.95804851595018]
Large Language Models (LLMs) perform impressively well in various applications.
The potential for misuse of these models in activities such as plagiarism, generating fake news, and spamming has raised concern about their responsible use.
We stress-test the robustness of these AI text detectors in the presence of an attacker.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)