Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection
- URL: http://arxiv.org/abs/2602.21593v1
- Date: Wed, 25 Feb 2026 05:38:08 GMT
- Title: Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection
- Authors: Zheng Gao, Xiaoyu Li, Zhicheng Bao, Xiaoyan Feng, Jiaojiao Jiang,
- Abstract summary: Traditional noise-layer-based watermarking remains vulnerable to inversion attacks that can recover embedded signals. Recent content-aware semantic watermarking schemes bind watermark signals to high-level image semantics, constraining local edits that would otherwise disrupt global coherence. We introduce a Coherence-Preserving Semantic Injection (CSI) attack that leverages LLM-guided semantic manipulation under embedding-space similarity constraints.
- Score: 6.443002210168185
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative images have proliferated across Web platforms, in both social media and online copyright-distribution scenarios, and semantic watermarking has increasingly been integrated into diffusion models to support reliable provenance tracking and forgery prevention for web content. Traditional noise-layer-based watermarking, however, remains vulnerable to inversion attacks that can recover embedded signals. To mitigate this, recent content-aware semantic watermarking schemes bind watermark signals to high-level image semantics, constraining local edits that would otherwise disrupt global coherence. Yet, large language models (LLMs) possess structured reasoning capabilities that enable targeted exploration of semantic spaces, allowing locally fine-grained but globally coherent semantic alterations that invalidate such bindings. To expose this overlooked vulnerability, we introduce a Coherence-Preserving Semantic Injection (CSI) attack that leverages LLM-guided semantic manipulation under embedding-space similarity constraints. This alignment enforces visual-semantic consistency while selectively perturbing watermark-relevant semantics, ultimately inducing detector misclassification. Extensive empirical results show that CSI consistently outperforms prevailing attack baselines against content-aware semantic watermarking, revealing a fundamental security weakness of current semantic watermark designs when confronted with LLM-driven semantic perturbations.
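The embedding-space similarity constraint described in the abstract can be illustrated with a minimal sketch: an LLM-proposed semantic edit is accepted only if its embedding stays close to that of the original content. The function names, the choice of cosine similarity, and the threshold `tau` are illustrative assumptions for exposition, not details taken from the paper.

```python
# Hedged sketch of an embedding-space similarity constraint.
# Names and the threshold value are hypothetical, not the authors' code.
import math


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def accept_semantic_edit(orig_emb, edited_emb, tau=0.9):
    """Accept an LLM-proposed edit only if it remains close to the
    original semantics in embedding space (similarity >= tau)."""
    return cosine_similarity(orig_emb, edited_emb) >= tau
```

Under this sketch, an edit whose embedding is nearly identical to the original passes the constraint, while an orthogonal (semantically unrelated) one is rejected; in the attack setting, the perturbation is chosen to pass this check while still flipping the watermark detector.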
Related papers
- SemBind: Binding Diffusion Watermarks to Semantics Against Black-Box Forgery Attacks [74.76909939060833]
Black-box forgery attacks pose an outsized risk to provenance and trust. We propose SemBind, a framework for latent-based watermarks that resists black-box forgery. We show that SemBind-enabled anti-forgery variants markedly reduce false acceptance under black-box forgery.
arXiv Detail & Related papers (2026-01-28T07:02:40Z)
- Attack-Resistant Watermarking for AIGC Image Forensics via Diffusion-based Semantic Deflection [22.992750993168404]
We introduce PAI, a training-free inherent watermarking framework for AIGC copyright protection. We design a novel key-conditioned deflection mechanism that subtly steers the denoising trajectory according to the user key. Experiments show that PAI achieves 98.43% verification accuracy, improving over SOTA methods by 37.25% on average, and retains strong tampering localization performance even against advanced AIGC edits.
arXiv Detail & Related papers (2026-01-10T17:49:08Z)
- From Essence to Defense: Adaptive Semantic-aware Watermarking for Embedding-as-a-Service Copyright Protection [24.55335024940469]
Embeddings-as-a-Service (EaaS) has emerged as a successful commercial paradigm on web platforms. Prior studies have revealed that EaaS is vulnerable to imitation attacks. We propose SemMark, a novel semantic-based watermarking paradigm for EaaS copyright protection.
arXiv Detail & Related papers (2025-12-18T11:50:38Z)
- StableGuard: Towards Unified Copyright Protection and Tamper Localization in Latent Diffusion Models [55.05404953041403]
We propose a novel framework that seamlessly integrates a binary watermark into the diffusion generation process. We show that StableGuard consistently outperforms state-of-the-art methods in image fidelity, watermark verification, and tampering localization.
arXiv Detail & Related papers (2025-09-22T16:35:19Z)
- Defending LLM Watermarking Against Spoofing Attacks with Contrastive Representation Learning [34.76886510334969]
A piggyback attack can maliciously alter the meaning of watermarked text (for example, transforming it into hate speech) while preserving the original watermark. We propose a semantic-aware watermarking algorithm that embeds watermarks into a given target text while preserving its original meaning.
arXiv Detail & Related papers (2025-04-09T04:38:17Z)
- Your Semantic-Independent Watermark is Fragile: A Semantic Perturbation Attack against EaaS Watermark [5.2431999629987]
Various studies have proposed backdoor-based watermarking schemes to protect the copyright of EaaS services. In this paper, we reveal that previous watermarking schemes possess semantic-independent characteristics and propose the Semantic Perturbation Attack (SPA). Our theoretical and experimental analyses demonstrate that this semantic-independent nature makes current watermarking schemes vulnerable to adaptive attacks that exploit semantic perturbation tests to bypass watermark verification.
arXiv Detail & Related papers (2024-11-14T11:06:34Z)
- MarkPlugger: Generalizable Watermark Framework for Latent Diffusion Models without Retraining [48.41130825143742]
In the fast-evolving era of AI-generated content (AIGC), the rapid iteration and modification of latent diffusion models (LDMs) make retraining watermark models costly. We propose MarkPlugger, a generalizable plug-and-play watermark framework that requires no LDM retraining. Our experimental findings reveal that our method effectively harmonizes image quality and watermark recovery rate.
arXiv Detail & Related papers (2024-04-08T15:29:46Z)
- WatME: Towards Lossless Watermarking Through Lexical Redundancy [58.61972059246715]
This study assesses the impact of watermarking on different capabilities of large language models (LLMs) from a cognitive science lens.
We introduce Watermarking with Mutual Exclusion (WatME) to seamlessly integrate watermarks.
arXiv Detail & Related papers (2023-11-16T11:58:31Z)
- A Robust Semantics-based Watermark for Large Language Model against Paraphrasing [50.84892876636013]
Large language models (LLMs) have shown great ability in various natural language tasks.
There are concerns that LLMs may be used improperly or even illegally.
We propose SemaMark, a semantics-based watermark framework.
arXiv Detail & Related papers (2023-11-15T06:19:02Z)
- Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily stolen by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.