DeepTextMark: A Deep Learning-Driven Text Watermarking Approach for
Identifying Large Language Model Generated Text
- URL: http://arxiv.org/abs/2305.05773v2
- Date: Mon, 11 Mar 2024 06:27:26 GMT
- Title: DeepTextMark: A Deep Learning-Driven Text Watermarking Approach for
Identifying Large Language Model Generated Text
- Authors: Travis Munyer, Abdullah Tanvir, Arjon Das, Xin Zhong
- Abstract summary: The importance of discerning whether texts are human-authored or generated by Large Language Models has become paramount.
DeepTextMark offers a viable "add-on" solution to prevailing text generation frameworks, requiring no direct access or alterations to the underlying text generation mechanism.
Experimental evaluations underscore the high imperceptibility, elevated detection accuracy, augmented robustness, reliability, and swift execution of DeepTextMark.
- Score: 1.249418440326334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid advancement of Large Language Models (LLMs) has significantly
enhanced the capabilities of text generators. With the potential for misuse
escalating, the importance of discerning whether texts are human-authored or
generated by LLMs has become paramount. Several preceding studies have ventured
to address this challenge by employing binary classifiers to differentiate
between human-written and LLM-generated text. Nevertheless, the reliability of
these classifiers has been subject to question. Given that consequential
decisions may hinge on the outcome of such classification, it is imperative
that text source detection is of high caliber. In light of this, the present
paper introduces DeepTextMark, a deep learning-driven text watermarking
methodology devised for text source identification. By leveraging Word2Vec and
Sentence Encoding for watermark insertion, alongside a transformer-based
classifier for watermark detection, DeepTextMark epitomizes a blend of
blindness, robustness, imperceptibility, and reliability. As elaborated within
the paper, these attributes are crucial for universal text source detection,
with a particular emphasis in this paper on text produced by LLMs. DeepTextMark
offers a viable "add-on" solution to prevailing text generation frameworks,
requiring no direct access or alterations to the underlying text generation
mechanism. Experimental evaluations underscore the high imperceptibility,
elevated detection accuracy, augmented robustness, reliability, and swift
execution of DeepTextMark.
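The abstract outlines a two-stage pipeline: watermark insertion that uses Word2Vec to propose word substitutions and a sentence encoder to keep the modified sentence semantically close to the original, followed by watermark detection with a transformer-based classifier. The snippet below is a minimal sketch of the insertion side only, assuming gensim's pretrained Word2Vec vectors and a sentence-transformers model as stand-ins for the paper's components; the candidate count and similarity threshold are illustrative placeholders, not the authors' settings.

```python
# Illustrative sketch of DeepTextMark-style watermark insertion (not the authors' code).
# Assumptions: gensim supplies pretrained Word2Vec vectors and sentence-transformers
# stands in for the paper's sentence encoder; topn=3 and min_similarity=0.95 are
# placeholder values chosen for readability.
import gensim.downloader as api
from sentence_transformers import SentenceTransformer, util

word_vectors = api.load("word2vec-google-news-300")   # large download; any KeyedVectors work
sent_encoder = SentenceTransformer("all-MiniLM-L6-v2")

def watermark_sentence(sentence: str, min_similarity: float = 0.95) -> str:
    """Swap one word for a Word2Vec neighbour while keeping the sentence embedding
    close to the original, so the change stays imperceptible to readers."""
    original_emb = sent_encoder.encode(sentence, convert_to_tensor=True)
    tokens = sentence.split()
    best_variant, best_sim = sentence, -1.0
    for i, word in enumerate(tokens):
        if word not in word_vectors:
            continue
        for candidate, _ in word_vectors.most_similar(word, topn=3):
            variant_tokens = tokens.copy()
            variant_tokens[i] = candidate
            variant = " ".join(variant_tokens)
            sim = util.cos_sim(
                original_emb,
                sent_encoder.encode(variant, convert_to_tensor=True),
            ).item()
            if sim > best_sim:
                best_variant, best_sim = variant, sim
    # Keep the substitution only if the semantic drift stays small; otherwise leave
    # the sentence unmarked rather than harm readability.
    return best_variant if best_sim >= min_similarity else sentence
```

Detection, in this style, would then fine-tune a transformer classifier on watermarked versus unwatermarked sentences and flag a document as LLM-generated when enough of its sentences are classified as carrying the watermark.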
Related papers
- Unveiling Large Language Models Generated Texts: A Multi-Level Fine-Grained Detection Framework [9.976099891796784]
Large language models (LLMs) have transformed human writing by enhancing grammar correction, content expansion, and stylistic refinement.
Existing detection methods, which mainly rely on single-feature analysis and binary classification, often fail to effectively identify LLM-generated text in academic contexts.
We propose a novel Multi-level Fine-grained Detection framework that detects LLM-generated text by integrating low-level structural, high-level semantic, and deep-level linguistic features.
arXiv Detail & Related papers (2024-10-18T07:25:00Z)
- Signal Watermark on Large Language Models [28.711745671275477]
We propose a watermarking method that embeds a specific watermark into text during its generation by Large Language Models (LLMs).
This technique not only ensures the watermark's invisibility to humans but also maintains the quality and grammatical integrity of model-generated text.
Our method has been empirically validated across multiple LLMs, consistently maintaining high detection accuracy.
arXiv Detail & Related papers (2024-10-09T04:49:03Z)
- Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts.
We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z)
- On Evaluating The Performance of Watermarked Machine-Generated Texts Under Adversarial Attacks [20.972194348901958]
We first survey the mainstream watermarking schemes and removal attacks on machine-generated texts.
We evaluate eight watermarks (five pre-text, three post-text) and twelve attacks (two pre-text, ten post-text) across 87 scenarios.
Results indicate that KGW and Exponential watermarks offer high text quality and watermark retention but remain vulnerable to most attacks.
arXiv Detail & Related papers (2024-07-05T18:09:06Z)
- Topic-Based Watermarks for LLM-Generated Text [46.71493672772134]
This paper proposes a novel topic-based watermarking algorithm for large language models (LLMs).
By using topic-specific token biases, we embed a topic-sensitive watermark into the generated text.
We demonstrate that our proposed watermarking scheme classifies various watermarked text topics with 99.99% confidence.
arXiv Detail & Related papers (2024-04-02T17:49:40Z)
- Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [81.62249424226084]
Token-level watermarking inserts watermarks into generated texts by altering the token probability distributions (a minimal sketch of this logit-biasing idea appears after this list).
Because the algorithm alters the logits during generation, it can degrade text quality.
We propose to improve the quality of texts generated by a watermarked language model via Watermarking with Importance Scoring (WIS).
arXiv Detail & Related papers (2023-11-16T08:36:00Z)
- Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism.
Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs.
We propose Codable Text Watermarking for LLMs (CTWL), which allows text watermarks to carry multi-bit customizable information.
arXiv Detail & Related papers (2023-07-29T14:11:15Z)
- Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy [52.765898203824975]
We introduce a semantic-aware watermarking algorithm that considers the characteristics of conditional text generation and the input context.
Experimental results demonstrate that our proposed method yields substantial improvements across various text generation models.
arXiv Detail & Related papers (2023-07-25T20:24:22Z)
- Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc.
Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques.
In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
- Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding [80.3811072650087]
We study natural language watermarking as a defense to help better mark and trace the provenance of text.
We introduce the Adversarial Watermarking Transformer (AWT) with a jointly trained encoder-decoder and adversarial training.
AWT is the first end-to-end model to hide data in text by automatically learning -- without ground truth -- word substitutions along with their locations.
arXiv Detail & Related papers (2020-09-07T11:01:24Z)
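As referenced in the word-importance-scoring entry above, the sketch below illustrates the token-level (green-list) logit-biasing idea that several of these papers build on; the seeding scheme and the gamma and delta values are illustrative placeholders rather than the parameters of any specific method.

```python
# Minimal sketch of token-level watermarking by logit biasing (green-list style).
# Assumption: at each decoding step a pseudo-random subset of the vocabulary, keyed
# by the previous token, receives a +delta bias so generation favours those tokens.
import torch

def biased_logits(logits: torch.Tensor, prev_token: int,
                  gamma: float = 0.5, delta: float = 2.0) -> torch.Tensor:
    """Add +delta to the logits of a pseudo-random 'green list' of the vocabulary."""
    vocab_size = logits.shape[-1]
    rng = torch.Generator().manual_seed(prev_token)           # green list depends on context
    green = torch.randperm(vocab_size, generator=rng)[: int(gamma * vocab_size)]
    out = logits.clone()
    out[..., green] += delta                                   # nudge sampling toward green tokens
    return out
```

A detector that knows the seeding scheme recomputes each step's green list, counts how many generated tokens fall inside it, and applies a statistical test against the fraction gamma expected for unwatermarked text.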
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.