EAGLE: A Domain Generalization Framework for AI-generated Text Detection
- URL: http://arxiv.org/abs/2403.15690v1
- Date: Sat, 23 Mar 2024 02:44:20 GMT
- Title: EAGLE: A Domain Generalization Framework for AI-generated Text Detection
- Authors: Amrita Bhattacharjee, Raha Moraffah, Joshua Garland, Huan Liu,
- Abstract summary: We propose a domain generalization framework for the detection of AI-generated text from unseen target generators.
We demonstrate how our framework effectively achieves impressive performance in detecting text generated by unseen target generators.
- Score: 15.254775341371364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the advancement in capabilities of Large Language Models (LLMs), one major step in the responsible and safe use of such LLMs is to be able to detect text generated by these models. While supervised AI-generated text detectors perform well on text generated by older LLMs, with the frequent release of new LLMs, building supervised detectors for identifying text from such new models would require new labeled training data, which is infeasible in practice. In this work, we tackle this problem and propose a domain generalization framework for the detection of AI-generated text from unseen target generators. Our proposed framework, EAGLE, leverages the labeled data that is available so far from older language models and learns features invariant across these generators, in order to detect text generated by an unknown target generator. EAGLE learns such domain-invariant features by combining the representational power of self-supervised contrastive learning with domain adversarial training. Through our experiments we demonstrate how EAGLE effectively achieves impressive performance in detecting text generated by unseen target generators, including recent state-of-the-art ones such as GPT-4 and Claude, reaching detection scores of within 4.7% of a fully supervised detector.
Related papers
- DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios [38.952481877244644]
We present a new benchmark, DetectRL, highlighting that even state-of-the-art (SOTA) detection techniques still underperformed in this task.
Our development of DetectRL reveals the strengths and limitations of current SOTA detectors.
We believe DetectRL could serve as an effective benchmark for assessing detectors in real-world scenarios.
arXiv Detail & Related papers (2024-10-31T09:01:25Z) - GigaCheck: Detecting LLM-generated Content [72.27323884094953]
In this work, we investigate the task of generated text detection by proposing the GigaCheck.
Our research explores two approaches: (i) distinguishing human-written texts from LLM-generated ones, and (ii) detecting LLM-generated intervals in Human-Machine collaborative texts.
Specifically, we use a fine-tuned general-purpose LLM in conjunction with a DETR-like detection model, adapted from computer vision, to localize artificially generated intervals within text.
arXiv Detail & Related papers (2024-10-31T08:30:55Z) - Detecting AI-Generated Texts in Cross-Domains [3.2245324254437846]
We train a ranking classifier called RoBERTa-Ranker as a baseline model.
We then present a method to fine-tune RoBERTa-Ranker that requires only a small amount of labeled data in a new domain.
Experiments show that this fine-tuned domain-aware model outperforms the popular DetectGPT and GPTZero.
arXiv Detail & Related papers (2024-10-17T18:43:30Z) - Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts.
We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z) - CUDRT: Benchmarking the Detection of Human vs. Large Language Models Generated Texts [10.027843402296678]
This paper constructs a comprehensive benchmark in both Chinese and English to evaluate mainstream AI-generated text detectors.
We categorize text generation into five distinct operations: Create, Update, Delete, Rewrite, and Translate.
For each CUDRT category, we have developed extensive datasets to thoroughly assess detector performance.
arXiv Detail & Related papers (2024-06-13T12:43:40Z) - Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics for Detecting Human vs. Machine-Generated Text [8.290557547578146]
We introduce a novel system, T5LLMCipher, for detecting machine-generated text using a pretrained T5 encoder combined with LLM embedding sub-clustering.
We find that our approach provides state-of-the-art generalization ability, with an average increase in F1 score on machine-generated text of 19.6% on unseen generators and domains.
arXiv Detail & Related papers (2024-01-17T18:45:13Z) - SeqXGPT: Sentence-Level AI-Generated Text Detection [62.3792779440284]
We introduce a sentence-level detection challenge by synthesizing documents polished with large language models (LLMs)
We then propose textbfSequence textbfX (Check) textbfGPT, a novel method that utilizes log probability lists from white-box LLMs as features for sentence-level AIGT detection.
arXiv Detail & Related papers (2023-10-13T07:18:53Z) - On the Zero-Shot Generalization of Machine-Generated Text Detectors [41.25534723956849]
Large language models are fluent enough to generate text indistinguishable from human-written language.
This work is motivated by an important research question: How will the detectors of machine-generated text perform on outputs of a new generator, that the detectors were not trained on?
arXiv Detail & Related papers (2023-10-08T13:49:51Z) - LLMDet: A Third Party Large Language Models Generated Text Detection
Tool [119.0952092533317]
Large language models (LLMs) are remarkably close to high-quality human-authored text.
Existing detection tools can only differentiate between machine-generated and human-authored text.
We propose LLMDet, a model-specific, secure, efficient, and extendable detection tool.
arXiv Detail & Related papers (2023-05-24T10:45:16Z) - MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection.
We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs.
Despite challenges, the top-performing detector can identify 86.54% out-of-domain texts generated by a new LLM, indicating the feasibility for application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z) - Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc.
Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques.
In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.