Temperature Matters: Enhancing Watermark Robustness Against Paraphrasing Attacks
- URL: http://arxiv.org/abs/2506.22623v1
- Date: Fri, 27 Jun 2025 20:39:35 GMT
- Title: Temperature Matters: Enhancing Watermark Robustness Against Paraphrasing Attacks
- Authors: Badr Youbi Idrissi, Monica Millunzi, Amelia Sorrenti, Lorenzo Baraldi, Daryna Dementieva,
- Abstract summary: This research project is focused on the development of a novel methodology for the detection of synthetic text.<n>We propose an innovative watermarking approach and subject it to rigorous evaluation, employing paraphrased generated text to asses its robustness.
- Score: 21.416846120175368
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the present-day scenario, Large Language Models (LLMs) are establishing their presence as powerful instruments permeating various sectors of society. While their utility offers valuable support to individuals, there are multiple concerns over potential misuse. Consequently, some academic endeavors have sought to introduce watermarking techniques, characterized by the inclusion of markers within machine-generated text, to facilitate algorithmic identification. This research project is focused on the development of a novel methodology for the detection of synthetic text, with the overarching goal of ensuring the ethical application of LLMs in AI-driven text generation. The investigation commences with replicating findings from a previous baseline study, thereby underscoring its susceptibility to variations in the underlying generation model. Subsequently, we propose an innovative watermarking approach and subject it to rigorous evaluation, employing paraphrased generated text to asses its robustness. Experimental results highlight the robustness of our proposal compared to the~\cite{aarson} watermarking method.
Related papers
- BiMark: Unbiased Multilayer Watermarking for Large Language Models [54.58546293741373]
We propose BiMark, a novel watermarking framework that balances text quality preservation and message embedding capacity.<n>BiMark achieves up to 30% higher extraction rates for short texts while maintaining text quality indicated by lower perplexity.
arXiv Detail & Related papers (2025-06-19T11:08:59Z) - GaussMark: A Practical Approach for Structural Watermarking of Language Models [61.84270985214254]
GaussMark is a simple, efficient, and relatively robust scheme for watermarking large language models.<n>We show that GaussMark is reliable, efficient, and relatively robust to corruptions such as insertions, deletions, substitutions, and roundtrip translations.
arXiv Detail & Related papers (2025-01-17T22:30:08Z) - Signal Watermark on Large Language Models [28.711745671275477]
We propose a watermarking method embedding a specific watermark into the text during its generation by Large Language Models (LLMs)
This technique not only ensures the watermark's invisibility to humans but also maintains the quality and grammatical integrity of model-generated text.
Our method has been empirically validated across multiple LLMs, consistently maintaining high detection accuracy.
arXiv Detail & Related papers (2024-10-09T04:49:03Z) - On Evaluating The Performance of Watermarked Machine-Generated Texts Under Adversarial Attacks [20.972194348901958]
We first comb the mainstream watermarking schemes and removal attacks on machine-generated texts.<n>We evaluate eight watermarks (five pre-text, three post-text) and twelve attacks (two pre-text, ten post-text) across 87 scenarios.<n>Results indicate that KGW and Exponential watermarks offer high text quality and watermark retention but remain vulnerable to most attacks.
arXiv Detail & Related papers (2024-07-05T18:09:06Z) - Hidding the Ghostwriters: An Adversarial Evaluation of AI-Generated
Student Essay Detection [29.433764586753956]
Large language models (LLMs) have exhibited remarkable capabilities in text generation tasks.
The utilization of these models carries inherent risks, including but not limited to plagiarism, the dissemination of fake news, and issues in educational exercises.
This paper aims to bridge this gap by constructing AIG-ASAP, an AI-generated student essay dataset.
arXiv Detail & Related papers (2024-02-01T08:11:56Z) - A Survey of Text Watermarking in the Era of Large Language Models [91.36874607025909]
Text watermarking algorithms are crucial for protecting the copyright of textual content.
Recent advancements in large language models (LLMs) have revolutionized these techniques.
This paper conducts a comprehensive survey of the current state of text watermarking technology.
arXiv Detail & Related papers (2023-12-13T06:11:42Z) - Improving the Generation Quality of Watermarked Large Language Models
via Word Importance Scoring [81.62249424226084]
Token-level watermarking inserts watermarks in the generated texts by altering the token probability distributions.
This watermarking algorithm alters the logits during generation, which can lead to a downgraded text quality.
We propose to improve the quality of texts generated by a watermarked language model by Watermarking with Importance Scoring (WIS)
arXiv Detail & Related papers (2023-11-16T08:36:00Z) - Towards Possibilities & Impossibilities of AI-generated Text Detection:
A Survey [97.33926242130732]
Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses.
Despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs.
To address these concerns, a consensus among the research community is to develop algorithmic solutions to detect AI-generated text.
arXiv Detail & Related papers (2023-10-23T18:11:32Z) - Watermarking Conditional Text Generation for AI Detection: Unveiling
Challenges and a Semantic-Aware Watermark Remedy [52.765898203824975]
We introduce a semantic-aware watermarking algorithm that considers the characteristics of conditional text generation and the input context.
Experimental results demonstrate that our proposed method yields substantial improvements across various text generation models.
arXiv Detail & Related papers (2023-07-25T20:24:22Z) - Provable Robust Watermarking for AI-Generated Text [41.5510809722375]
We propose a robust and high-quality watermark method, Unigram-Watermark.
We prove that our watermark method enjoys guaranteed generation quality, correctness in watermark detection, and is robust against text editing and paraphrasing.
arXiv Detail & Related papers (2023-06-30T07:24:32Z) - DeepTextMark: A Deep Learning-Driven Text Watermarking Approach for
Identifying Large Language Model Generated Text [1.249418440326334]
The importance of discerning whether texts are human-authored or generated by Large Language Models has become paramount.
DeepTextMark offers a viable "add-on" solution to prevailing text generation frameworks, requiring no direct access or alterations to the underlying text generation mechanism.
Experimental evaluations underscore the high imperceptibility, elevated detection accuracy, augmented robustness, reliability, and swift execution of DeepTextMark.
arXiv Detail & Related papers (2023-05-09T21:31:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.