On the Zero-Shot Generalization of Machine-Generated Text Detectors
- URL: http://arxiv.org/abs/2310.05165v1
- Date: Sun, 8 Oct 2023 13:49:51 GMT
- Title: On the Zero-Shot Generalization of Machine-Generated Text Detectors
- Authors: Xiao Pu, Jingyu Zhang, Xiaochuang Han, Yulia Tsvetkov, Tianxing He
- Abstract summary: Large language models are fluent enough to generate text indistinguishable from human-written language.
This work is motivated by an important research question: How will detectors of machine-generated text perform on outputs of a new generator that they were not trained on?
- Score: 41.25534723956849
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rampant proliferation of large language models, fluent enough to generate
text indistinguishable from human-written language, gives unprecedented
importance to the detection of machine-generated text. This work is motivated
by an important research question: How will detectors of machine-generated
text perform on outputs of a new generator that they were not trained on? We
begin by collecting generation data from a wide range of LLMs, train neural
detectors on data from each generator, and test their performance on held-out
generators. While none of the detectors generalizes to all generators, we
observe a consistent and interesting pattern: detectors trained on data from a
medium-size LLM can zero-shot generalize to its larger version. As a concrete
application, we demonstrate that robust detectors can be built on an ensemble
of training data from medium-sized models.
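As a rough illustration of the cross-generator evaluation described in the abstract, the sketch below trains a detector on one generator's data and scores it on a held-out generator. It is a minimal stand-in, not the authors' code: it uses a TF-IDF + logistic-regression classifier in place of the paper's neural detectors, and the per-generator corpora (`corpora`, `train_detector`, `zero_shot_auc`) are hypothetical placeholders.

```python
# Minimal sketch of the cross-generator (zero-shot) evaluation.
# Assumptions (not from the paper): `corpora` maps a generator name to a list of
# (text, label) pairs, where label 1 = machine-generated and 0 = human-written.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline

def train_detector(train_pairs):
    """Fit a simple detector on one generator's data (stand-in for a neural detector)."""
    texts, labels = zip(*train_pairs)
    detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                             LogisticRegression(max_iter=1000))
    detector.fit(texts, labels)
    return detector

def zero_shot_auc(detector, test_pairs):
    """Score the detector on a held-out generator it never saw during training."""
    texts, labels = zip(*test_pairs)
    scores = detector.predict_proba(texts)[:, 1]
    return roc_auc_score(labels, scores)

# Hypothetical toy corpora keyed by generator.
corpora = {
    "gpt2-medium": [("an example machine text", 1), ("an example human text", 0)] * 10,
    "gpt2-xl":     [("another machine text", 1), ("another human text", 0)] * 10,
}

# Train on the medium-size generator, test zero-shot on the larger one.
detector = train_detector(corpora["gpt2-medium"])
print("held-out AUC:", zero_shot_auc(detector, corpora["gpt2-xl"]))
```

The concluding idea of an ensemble of training data from medium-sized models would, in this sketch, amount to pooling several such per-generator training sets before fitting the detector.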
Related papers
- Robust Detection of LLM-Generated Text: A Comparative Analysis [0.276240219662896]
Large language models can be integrated into many aspects of daily life, and their output can quickly fill online resources.
It becomes increasingly important to develop powerful detectors for such generated text.
Such detectors are essential to prevent the potential misuse of these technologies and to protect areas such as social media from their negative effects.
arXiv Detail & Related papers (2024-11-09T18:27:15Z) - DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios [38.952481877244644]
We present a new benchmark, DetectRL, highlighting that even state-of-the-art (SOTA) detection techniques still underperform in real-world scenarios.
Our development of DetectRL reveals the strengths and limitations of current SOTA detectors.
We believe DetectRL could serve as an effective benchmark for assessing detectors in real-world scenarios.
arXiv Detail & Related papers (2024-10-31T09:01:25Z) - Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts.
We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z) - Zero-Shot Machine-Generated Text Detection Using Mixture of Large Language Models [35.67613230687864]
Large Language Models (LLMs) are trained at scale and endowed with powerful text-generating abilities.
We propose a new, theoretically grounded approach to combine the respective strengths of multiple LLMs used as detectors.
Our experiments, using a variety of generator LLMs, suggest that our method effectively increases the robustness of detection.
arXiv Detail & Related papers (2024-09-11T20:55:12Z) - EAGLE: A Domain Generalization Framework for AI-generated Text Detection [15.254775341371364]
We propose a domain generalization framework for the detection of AI-generated text from unseen target generators.
We demonstrate that our framework achieves strong detection performance on text from generators not seen during training.
arXiv Detail & Related papers (2024-03-23T02:44:20Z) - Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text [98.28130949052313]
A score based on contrasting two closely related language models is highly accurate at separating human-written and machine-generated text.
We propose a novel LLM detector that requires only simple calculations with a pair of pre-trained LLMs (a minimal sketch of such a paired-model score appears after this list).
The method, called Binoculars, achieves state-of-the-art accuracy without any training data.
arXiv Detail & Related papers (2024-01-22T16:09:47Z) - M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box
Machine-Generated Text Detection [69.29017069438228]
Large language models (LLMs) have demonstrated remarkable capability to generate fluent responses to a wide variety of user queries.
This has also raised concerns about the potential misuse of such texts in journalism, education, and academia.
In this study, we strive to create automated systems that can detect machine-generated texts and pinpoint potential misuse.
arXiv Detail & Related papers (2023-05-24T08:55:11Z) - Smaller Language Models are Better Black-box Machine-Generated Text
Detectors [56.36291277897995]
Small and partially-trained models are better universal text detectors (see the likelihood-scoring sketch after this list).
We find that whether the detector and generator were trained on the same data is not critically important to detection success.
For instance, the OPT-125M model has an AUC of 0.81 in detecting ChatGPT generations, whereas a larger model from the GPT family, GPTJ-6B, has an AUC of 0.45.
arXiv Detail & Related papers (2023-05-17T00:09:08Z) - On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approaches human-like quality, the sample size required for reliable detection increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.