DeepTextMark: A Deep Learning-Driven Text Watermarking Approach for
Identifying Large Language Model Generated Text
- URL: http://arxiv.org/abs/2305.05773v2
- Date: Mon, 11 Mar 2024 06:27:26 GMT
- Title: DeepTextMark: A Deep Learning-Driven Text Watermarking Approach for
Identifying Large Language Model Generated Text
- Authors: Travis Munyer, Abdullah Tanvir, Arjon Das, Xin Zhong
- Abstract summary: The importance of discerning whether texts are human-authored or generated by Large Language Models has become paramount.
DeepTextMark offers a viable "add-on" solution to prevailing text generation frameworks, requiring no direct access or alterations to the underlying text generation mechanism.
Experimental evaluations underscore the high imperceptibility, elevated detection accuracy, augmented robustness, reliability, and swift execution of DeepTextMark.
- Score: 1.249418440326334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid advancement of Large Language Models (LLMs) has significantly
enhanced the capabilities of text generators. With the potential for misuse
escalating, the importance of discerning whether texts are human-authored or
generated by LLMs has become paramount. Several preceding studies have ventured
to address this challenge by employing binary classifiers to differentiate
between human-written and LLM-generated text. Nevertheless, the reliability of
these classifiers has been subject to question. Given that consequential
decisions may hinge on the outcome of such classification, it is imperative
that text source detection is of high caliber. In light of this, the present
paper introduces DeepTextMark, a deep learning-driven text watermarking
methodology devised for text source identification. By leveraging Word2Vec and
Sentence Encoding for watermark insertion, alongside a transformer-based
classifier for watermark detection, DeepTextMark epitomizes a blend of
blindness, robustness, imperceptibility, and reliability. As elaborated within
the paper, these attributes are crucial for universal text source detection,
with a particular emphasis in this paper on text produced by LLMs. DeepTextMark
offers a viable "add-on" solution to prevailing text generation frameworks,
requiring no direct access or alterations to the underlying text generation
mechanism. Experimental evaluations underscore the high imperceptibility,
elevated detection accuracy, augmented robustness, reliability, and swift
execution of DeepTextMark.
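The abstract outlines a two-stage pipeline: watermark insertion that uses Word2Vec to propose word substitutions and a sentence encoder to keep the modified sentence semantically close to the original, followed by watermark detection with a transformer-based classifier. The snippet below is a minimal sketch of the insertion side only, assuming gensim's pretrained Word2Vec vectors and a sentence-transformers model as stand-ins for the paper's components; the candidate count and similarity threshold are illustrative placeholders, not the authors' settings.

```python
# Illustrative sketch of DeepTextMark-style watermark insertion (not the authors' code).
# Assumptions: gensim supplies pretrained Word2Vec vectors and sentence-transformers
# stands in for the paper's sentence encoder; topn=3 and min_similarity=0.95 are
# placeholder values chosen for readability.
import gensim.downloader as api
from sentence_transformers import SentenceTransformer, util

word_vectors = api.load("word2vec-google-news-300")   # large download; any KeyedVectors work
sent_encoder = SentenceTransformer("all-MiniLM-L6-v2")

def watermark_sentence(sentence: str, min_similarity: float = 0.95) -> str:
    """Swap one word for a Word2Vec neighbour while keeping the sentence embedding
    close to the original, so the change stays imperceptible to readers."""
    original_emb = sent_encoder.encode(sentence, convert_to_tensor=True)
    tokens = sentence.split()
    best_variant, best_sim = sentence, -1.0
    for i, word in enumerate(tokens):
        if word not in word_vectors:
            continue
        for candidate, _ in word_vectors.most_similar(word, topn=3):
            variant_tokens = tokens.copy()
            variant_tokens[i] = candidate
            variant = " ".join(variant_tokens)
            sim = util.cos_sim(
                original_emb,
                sent_encoder.encode(variant, convert_to_tensor=True),
            ).item()
            if sim > best_sim:
                best_variant, best_sim = variant, sim
    # Keep the substitution only if the semantic drift stays small; otherwise leave
    # the sentence unmarked rather than harm readability.
    return best_variant if best_sim >= min_similarity else sentence
```

Detection, in this style, would then fine-tune a transformer classifier on watermarked versus unwatermarked sentences and flag a document as LLM-generated when enough of its sentences are classified as carrying the watermark.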
Related papers
- Unveiling Large Language Models Generated Texts: A Multi-Level Fine-Grained Detection Framework [9.976099891796784]
Large language models (LLMs) have transformed human writing by enhancing grammar correction, content expansion, and stylistic refinement.
Existing detection methods, which mainly rely on single-feature analysis and binary classification, often fail to effectively identify LLM-generated text in academic contexts.
We propose a novel Multi-level Fine-grained Detection framework that detects LLM-generated text by integrating low-level structural, high-level semantic, and deep-level linguistic features.
arXiv Detail & Related papers (2024-10-18T07:25:00Z)
- Signal Watermark on Large Language Models [28.711745671275477]
We propose a watermarking method that embeds a specific watermark into text during its generation by Large Language Models (LLMs).
This technique not only ensures the watermark's invisibility to humans but also maintains the quality and grammatical integrity of model-generated text.
Our method has been empirically validated across multiple LLMs, consistently maintaining high detection accuracy.
arXiv Detail & Related papers (2024-10-09T04:49:03Z)
- Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts.
We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z)
- On Evaluating The Performance of Watermarked Machine-Generated Texts Under Adversarial Attacks [20.972194348901958]
We first survey the mainstream watermarking schemes and removal attacks on machine-generated texts.
We evaluate eight watermarks (five pre-text, three post-text) and twelve attacks (two pre-text, ten post-text) across 87 scenarios.
Results indicate that KGW and Exponential watermarks offer high text quality and watermark retention but remain vulnerable to most attacks.
arXiv Detail & Related papers (2024-07-05T18:09:06Z)
- Topic-Based Watermarks for LLM-Generated Text [46.71493672772134]
This paper proposes a novel topic-based watermarking algorithm for large language models (LLMs).
By using topic-specific token biases, we embed a topic-sensitive watermark into the generated text.
We demonstrate that our proposed watermarking scheme classifies various watermarked text topics with 99.99% confidence.
arXiv Detail & Related papers (2024-04-02T17:49:40Z)
- Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [81.62249424226084]
Token-level watermarking inserts watermarks into generated texts by altering the token probability distributions (a minimal sketch of this logit-biasing idea appears after this list).
Because the algorithm alters the logits during generation, it can degrade text quality.
We propose to improve the quality of texts generated by a watermarked language model via Watermarking with Importance Scoring (WIS).
arXiv Detail & Related papers (2023-11-16T08:36:00Z)
- Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism.
Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs.
We propose Codable Text Watermarking for LLMs (CTWL), which allows text watermarks to carry multi-bit customizable information.
arXiv Detail & Related papers (2023-07-29T14:11:15Z)
- Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy [52.765898203824975]
We introduce a semantic-aware watermarking algorithm that considers the characteristics of conditional text generation and the input context.
Experimental results demonstrate that our proposed method yields substantial improvements across various text generation models.
arXiv Detail & Related papers (2023-07-25T20:24:22Z)
- Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc.
Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques.
In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
- Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding [80.3811072650087]
We study natural language watermarking as a defense to help better mark and trace the provenance of text.
We introduce the Adversarial Watermarking Transformer (AWT) with a jointly trained encoder-decoder and adversarial training.
AWT is the first end-to-end model to hide data in text by automatically learning -- without ground truth -- word substitutions along with their locations.
arXiv Detail & Related papers (2020-09-07T11:01:24Z)
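As referenced in the word-importance-scoring entry above, the sketch below illustrates the token-level (green-list) logit-biasing idea that several of these papers build on; the seeding scheme and the gamma and delta values are illustrative placeholders rather than the parameters of any specific method.

```python
# Minimal sketch of token-level watermarking by logit biasing (green-list style).
# Assumption: at each decoding step a pseudo-random subset of the vocabulary, keyed
# by the previous token, receives a +delta bias so generation favours those tokens.
import torch

def biased_logits(logits: torch.Tensor, prev_token: int,
                  gamma: float = 0.5, delta: float = 2.0) -> torch.Tensor:
    """Add +delta to the logits of a pseudo-random 'green list' of the vocabulary."""
    vocab_size = logits.shape[-1]
    rng = torch.Generator().manual_seed(prev_token)           # green list depends on context
    green = torch.randperm(vocab_size, generator=rng)[: int(gamma * vocab_size)]
    out = logits.clone()
    out[..., green] += delta                                   # nudge sampling toward green tokens
    return out
```

A detector that knows the seeding scheme recomputes each step's green list, counts how many generated tokens fall inside it, and applies a statistical test against the fraction gamma expected for unwatermarked text.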
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.