Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness
- URL: http://arxiv.org/abs/2409.16914v1
- Date: Wed, 25 Sep 2024 13:18:57 GMT
- Title: Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness
- Authors: Shixuan Ma, Quan Wang
- Abstract summary: We develop a generic dual-channel detection paradigm that uses token cohesiveness as a plug-and-play module to improve existing zero-shot detectors.
To calculate token cohesiveness, we use a few rounds of random token deletion and semantic difference measurement.
Experiments with four state-of-the-art base detectors on various datasets, source models, and evaluation settings demonstrate the effectiveness and generality of the proposed approach.
- Score: 6.229124658686219
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The increasing capability and widespread usage of large language models (LLMs) highlight the desirability of automatic detection of LLM-generated text. Zero-shot detectors, due to their training-free nature, have received considerable attention and notable success. In this paper, we identify a new feature, token cohesiveness, that is useful for zero-shot detection, and we demonstrate that LLM-generated text tends to exhibit higher token cohesiveness than human-written text. Based on this observation, we devise TOCSIN, a generic dual-channel detection paradigm that uses token cohesiveness as a plug-and-play module to improve existing zero-shot detectors. To calculate token cohesiveness, TOCSIN only requires a few rounds of random token deletion and semantic difference measurement, making it particularly suitable for a practical black-box setting where the source model used for generation is not accessible. Extensive experiments with four state-of-the-art base detectors on various datasets, source models, and evaluation settings demonstrate the effectiveness and generality of the proposed approach. Code available at: https://github.com/Shixuan-Ma/TOCSIN
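The abstract describes token cohesiveness as the result of a few rounds of random token deletion followed by semantic difference measurement. The sketch below illustrates that loop under stated assumptions and is not the authors' implementation: the function names, the whitespace tokenizer, the `rounds` and `delete_ratio` defaults, and the bag-of-words cosine distance used as the semantic difference are all hypothetical stand-ins chosen to keep the example self-contained. A real system would likely use a learned semantic similarity model instead.

```python
import math
import random
from collections import Counter


def semantic_difference(a_tokens, b_tokens):
    """Stand-in measure (assumption): 1 - cosine similarity of bag-of-words counts."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def token_cohesiveness(text, rounds=10, delete_ratio=0.1, seed=0):
    """Mean semantic difference between `text` and randomly token-deleted copies.

    `rounds`, `delete_ratio`, and `seed` are illustrative defaults, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    tokens = text.split()  # naive whitespace tokenization (assumption)
    if len(tokens) < 2:
        return 0.0
    # Delete at least one token, but always keep at least one.
    n_delete = min(len(tokens) - 1, max(1, int(len(tokens) * delete_ratio)))
    total = 0.0
    for _ in range(rounds):
        drop = set(rng.sample(range(len(tokens)), n_delete))
        kept = [t for i, t in enumerate(tokens) if i not in drop]
        total += semantic_difference(tokens, kept)
    return total / rounds


if __name__ == "__main__":
    sample = "zero shot detectors need no training data to flag generated text"
    print(token_cohesiveness(sample))
```

Because the procedure needs only the text itself and a similarity measure, it requires no access to the source model, which matches the black-box setting the abstract emphasizes. How this score is combined with a base detector in the dual-channel paradigm is not specified in the abstract and is left out of the sketch.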
Related papers
- Training-free LLM-generated Text Detection by Mining Token Probability Sequences [18.955509967889782]
Large language models (LLMs) have demonstrated remarkable capabilities in generating high-quality texts across diverse domains.
Training-free methods, which focus on inherent discrepancies through carefully designed statistical features, offer improved generalization and interpretability.
We introduce a novel training-free detector, termed Lastde, that synergizes local and global statistics for enhanced detection.
arXiv Detail & Related papers (2024-10-08T14:23:45Z)
- Zero-Shot Machine-Generated Text Detection Using Mixture of Large Language Models [35.67613230687864]
Large Language Models (LLMs) are trained at scale and endowed with powerful text-generating abilities.
We propose a new, theoretically grounded approach to combine their respective strengths.
Our experiments, using a variety of generator LLMs, suggest that our method effectively increases the robustness of detection.
arXiv Detail & Related papers (2024-09-11T20:55:12Z)
- Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore [51.65730053591696]
We propose a simple but effective black-box zero-shot detection approach.
It is predicated on the observation that human-written texts typically contain more grammatical errors than LLM-generated texts.
Our method achieves an average AUROC of 98.7% and shows strong robustness against paraphrase and adversarial perturbation attacks.
arXiv Detail & Related papers (2024-05-07T12:57:01Z)
- Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level.
arXiv Detail & Related papers (2023-11-20T03:17:21Z)
- Zero-Shot Detection of Machine-Generated Codes [83.0342513054389]
This work proposes a training-free approach for the detection of LLMs-generated codes.
We find that existing training-based or zero-shot text detectors are ineffective in detecting code.
Our method exhibits robustness against revision attacks and generalizes well to Java codes.
arXiv Detail & Related papers (2023-10-08T10:08:21Z)
- DPIC: Decoupling Prompt and Intrinsic Characteristics for LLM Generated Text Detection [56.513637720967566]
Large language models (LLMs) can generate texts that pose risks of misuse, such as plagiarism, planting fake reviews on e-commerce platforms, or creating inflammatory false tweets.
Existing high-quality detection methods usually require access to the interior of the model to extract the intrinsic characteristics.
We propose to extract deep intrinsic characteristics of the black-box model generated texts.
arXiv Detail & Related papers (2023-05-21T17:26:16Z)
- Large Language Models can be Guided to Evade AI-Generated Text Detection [40.7707919628752]
Large language models (LLMs) have shown remarkable performance in various tasks and have been extensively utilized by the public.
We equip LLMs with prompts, rather than relying on an external paraphraser, to evaluate the vulnerability of these detectors.
We propose a novel Substitution-based In-Context example optimization method (SICO) to automatically construct prompts for evading the detectors.
arXiv Detail & Related papers (2023-05-18T10:03:25Z)
- Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc.
Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques.
In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
- DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature [143.5381108333212]
We show that text sampled from a large language model tends to occupy negative curvature regions of the model's log probability function.
We then define a new curvature-based criterion for judging if a passage is generated from a given LLM.
We find DetectGPT is more discriminative than existing zero-shot methods for model sample detection.
arXiv Detail & Related papers (2023-01-26T18:44:06Z)