GTSD: Generative Text Steganography Based on Diffusion Model
- URL: http://arxiv.org/abs/2504.19433v1
- Date: Mon, 28 Apr 2025 02:42:52 GMT
- Title: GTSD: Generative Text Steganography Based on Diffusion Model
- Authors: Zhengxian Wu, Juan Wen, Yiming Xue, Ziwei Zhang, Yinghan Zhou,
- Abstract summary: We propose a generative text steganography method based on a diffusion model (GTSD), which improves generation speed, robustness, and imperceptibility while maintaining security. Prompt mapping maps secret information into a conditional prompt that guides the pre-trained diffusion model to generate batches of candidate sentences. Batch mapping then selects the stego text from these batches according to the secret information.
- Score: 21.996779455449094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid development of deep learning, existing generative text steganography methods based on autoregressive models have achieved success. However, these autoregressive steganography approaches have certain limitations. First, existing methods must encode candidate words according to their output probabilities and generate each stego word one by one, which makes the generation process time-consuming. Second, encoding and selecting candidate words changes the sampling probabilities, resulting in poor imperceptibility of the stego text. Third, existing methods have low robustness and cannot resist replacement attacks. To address these issues, we propose a generative text steganography method based on a diffusion model (GTSD), which improves generation speed, robustness, and imperceptibility while maintaining security. Specifically, a novel steganography scheme based on a diffusion model is proposed to embed secret information through prompt mapping and batch mapping. The prompt mapping maps secret information into a conditional prompt to guide the pre-trained diffusion model in generating batches of candidate sentences. The batch mapping selects the stego text from these batches according to the secret information. Extensive experiments show that GTSD outperforms the SOTA method in terms of generation speed, robustness, and imperceptibility while maintaining comparable anti-steganalysis performance. Moreover, we verify that GTSD has strong potential: embedding capacity is positively correlated with prompt capacity and batch size while security is maintained.
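As a rough illustration of the two mappings, here is a minimal sketch assuming a prompt table of size 2^k (the first k secret bits pick the conditional prompt) and a batch of 2^m candidates (the next m bits pick the stego sentence); the function names and the stubbed generator are illustrative, not the paper's code:

```python
import math

def bits_to_int(bits):
    """Interpret a bit string like "10" as an integer index."""
    return int(bits, 2) if bits else 0

def gtsd_embed(secret_bits, prompt_table, batch_size, generate_batch):
    """Illustrative GTSD-style embedding step (hypothetical interface).

    Prompt mapping: the first log2(len(prompt_table)) bits select a
    conditional prompt. Batch mapping: the remaining log2(batch_size)
    bits select one sentence from the generated batch.
    """
    k_prompt = int(math.log2(len(prompt_table)))
    k_batch = int(math.log2(batch_size))
    assert len(secret_bits) == k_prompt + k_batch

    prompt = prompt_table[bits_to_int(secret_bits[:k_prompt])]
    # In GTSD this call is a pre-trained text diffusion model; stubbed here.
    candidates = generate_batch(prompt, batch_size)
    return candidates[bits_to_int(secret_bits[k_prompt:])]

def fake_generate(prompt, n):  # stand-in for the diffusion model
    return [f"{prompt} candidate {i}" for i in range(n)]

prompts = ["Talk about travel.", "Talk about food.",
           "Talk about sports.", "Talk about music."]
print(gtsd_embed("1011", prompts, batch_size=4, generate_batch=fake_generate))
```

Under this reading, the capacity per stego sentence is log2(|prompt table|) + log2(batch size), consistent with the abstract's claim that embedding capacity grows with prompt capacity and batch size.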
Related papers
- Shifting-Merging: Secure, High-Capacity and Efficient Steganography via Large Language Models [25.52890764952079]
Steganography offers a way to securely hide messages within innocent-looking texts. Large Language Models (LLMs) provide a high-quality and explicit distribution. ShiMer pseudorandomly shifts the probability interval of the LLM's distribution to obtain a private distribution.
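A minimal sketch of the interval-shifting idea, assuming an arithmetic-coding view where each token owns a sub-interval of [0, 1) and the offset is derived from a key shared with the receiver (all names here are illustrative, not the paper's):

```python
import hashlib

def shifted_token(probs, key, step, point):
    """Shift each token's cumulative interval by a pseudorandom offset and
    return the token whose shifted interval contains `point`.

    `probs` is a list of (token, probability) pairs from the LLM; `point`
    would be driven by the secret bitstream in a real embedder.
    """
    digest = hashlib.sha256(f"{key}:{step}".encode()).digest()
    shift = int.from_bytes(digest[:8], "big") / 2**64  # offset in [0, 1)

    cum = 0.0
    for token, p in probs:
        lo = (cum + shift) % 1.0
        hi = (cum + p + shift) % 1.0
        cum += p
        # handle intervals that wrap around 1.0
        if (lo < hi and lo <= point < hi) or (lo >= hi and (point >= lo or point < hi)):
            return token
    return probs[-1][0]  # guard against floating-point gaps

print(shifted_token([("the", 0.5), ("a", 0.3), ("cat", 0.2)],
                    key="shared", step=0, point=0.42))
```

Intuitively, the shift merely rotates the unit interval, so the visible token distribution is unchanged while the interval-to-token assignment stays private to key holders.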
arXiv Detail & Related papers (2025-01-01T09:51:15Z)
- FreStega: A Plug-and-Play Method for Boosting Imperceptibility and Capacity in Generative Linguistic Steganography for Real-World Scenarios [0.0]
Linguistic steganography embeds secret information in seemingly innocent texts, safeguarding privacy in surveillance environments. We propose FreStega, a plug-and-play method to reconstruct the distribution of language models used for generative linguistic steganography.
arXiv Detail & Related papers (2024-12-27T13:56:51Z)
- DiffusionAttacker: Diffusion-Driven Prompt Manipulation for LLM Jailbreak [51.8218217407928]
Large Language Models (LLMs) are susceptible to generating harmful content when prompted with carefully crafted inputs.
This paper introduces DiffusionAttacker, an end-to-end generative approach for jailbreak rewriting inspired by diffusion models.
arXiv Detail & Related papers (2024-12-23T12:44:54Z)
- Detecting, Explaining, and Mitigating Memorization in Diffusion Models [49.438362005962375]
We introduce a straightforward yet effective method for detecting memorized prompts by inspecting the magnitude of text-conditional predictions.
Our proposed method seamlessly integrates without disrupting sampling algorithms, and delivers high accuracy even at the first generation step.
Building on our detection strategy, we unveil an explainable approach that shows the contribution of individual words or tokens to memorization.
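The detection signal can be sketched in a few lines, assuming the standard classifier-free-guidance setup where the denoiser is queried both with and without the text prompt; the tensor shapes and any thresholding policy below are placeholders of mine:

```python
import torch

def memorization_score(eps_cond, eps_uncond):
    """Sketch of the detection signal described above: the magnitude of the
    text-conditional component of the noise prediction. Large values over a
    few sampling steps flag a likely memorized prompt (the threshold
    calibration is left open here)."""
    return (eps_cond - eps_uncond).flatten(1).norm(dim=1).mean()

# eps_cond / eps_uncond would come from one denoising step of the diffusion
# UNet with and without the text prompt; stand-in tensors for illustration:
e_c = torch.randn(2, 4, 64, 64)
e_u = torch.randn(2, 4, 64, 64)
print(float(memorization_score(e_c, e_u)))
```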
arXiv Detail & Related papers (2024-07-31T16:13:29Z)
- Provably Secure Disambiguating Neural Linguistic Steganography [66.30965740387047]
The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures. We propose a novel secure disambiguation method named SyncPool, which effectively addresses the segmentation ambiguity problem. SyncPool does not change the size of the candidate pool or the distribution of tokens and thus is applicable to provably secure language steganography methods.
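A toy rendering of the synchronization idea, assuming ambiguity arises from prefix tokens such as "sun" vs. "sunny" and that sender and receiver share a key (the helper name is mine, not the paper's):

```python
import random

def syncpool_select(ambiguous_pool, shared_key, step):
    """All tokens that would make decoding ambiguous (e.g. "sun" vs.
    "sunny", where one is a prefix of the other) are merged into one pool.
    The secret bits select the pool as a whole; the concrete token inside
    the pool is drawn with a PRNG seeded from a shared key, so the receiver
    can replay the same draw and decoding never desynchronizes."""
    rng = random.Random(f"{shared_key}:{step}")
    return rng.choice(sorted(ambiguous_pool))  # deterministic ordering

print(syncpool_select({"sun", "sunny"}, shared_key="k", step=0))
```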
arXiv Detail & Related papers (2024-03-26T09:25:57Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- Verifying the Robustness of Automatic Credibility Assessment [50.55687778699995]
We show that meaning-preserving changes in input text can mislead the models.
We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and are proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
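For context, a generic self-training loop looks as follows; SFLM's specifics (prompt-based templates and a consistency term between augmented views) are omitted, and the `model` interface is an assumption of this sketch rather than SFLM's actual API:

```python
def self_train(model, labeled, unlabeled, threshold=0.9, rounds=3):
    """Generic self-training loop: fit on labeled data, pseudo-label the
    unlabeled pool, keep only confident predictions, and repeat."""
    for _ in range(rounds):
        model.fit(labeled)
        pseudo = []
        for x in unlabeled:
            label, conf = model.predict_with_confidence(x)  # assumed interface
            if conf >= threshold:  # keep only confident pseudo-labels
                pseudo.append((x, label))
        labeled = labeled + pseudo
    return model
```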
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Provably Secure Generative Linguistic Steganography [29.919406917681282]
We present ADG, a novel provably secure generative linguistic steganographic method.
ADG embeds secret information by Adaptive Dynamic Grouping of tokens according to their probability given by an off-the-shelf language model.
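A simplified grouping step in the spirit of ADG; the actual algorithm chooses the number of groups adaptively from the distribution, whereas this sketch fixes it, and all names are illustrative:

```python
import random

def adg_embed_step(probs, bits_per_step, secret_bits, rng):
    """Greedily pack tokens into 2^bits_per_step groups of roughly equal
    total probability; the next secret bits pick a group, and the emitted
    token is sampled from that group's renormalized distribution."""
    n_groups = 2 ** bits_per_step
    groups = [[] for _ in range(n_groups)]
    masses = [0.0] * n_groups
    # greedy balancing: heaviest remaining token goes to the lightest group
    for token, p in sorted(probs, key=lambda tp: -tp[1]):
        g = masses.index(min(masses))
        groups[g].append((token, p))
        masses[g] += p
    g = int(secret_bits[:bits_per_step], 2)
    tokens, ps = zip(*groups[g])
    return rng.choices(tokens, weights=ps, k=1)[0], secret_bits[bits_per_step:]

rng = random.Random(0)
vocab = [("the", 0.4), ("a", 0.25), ("cat", 0.2), ("dog", 0.15)]
print(adg_embed_step(vocab, bits_per_step=1, secret_bits="10", rng=rng))
```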
arXiv Detail & Related papers (2021-06-03T17:27:10Z)
- BERT-ATTACK: Adversarial Attack Against BERT Using BERT [77.82947768158132]
Adversarial attacks on discrete data (such as text) are more challenging than attacks on continuous data (such as images).
We propose BERT-Attack, a high-quality and effective method to generate adversarial samples.
Our method outperforms state-of-the-art attack strategies in both success rate and perturb percentage.
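A greedy, single-word rendition of the idea; the published method additionally ranks words by importance and handles sub-word pieces, and `victim_predict` below is a hypothetical stand-in for the attacked classifier:

```python
from transformers import pipeline

def bert_attack_step(sentence, word, victim_predict, unmasker):
    """Mask one word, let a masked language model propose fluent
    substitutes, and return the first substitution that flips the victim
    model's prediction (a toy sketch of the BERT-Attack idea)."""
    masked = sentence.replace(word, "[MASK]", 1)
    original_label = victim_predict(sentence)
    for cand in unmasker(masked, top_k=10):
        sub = cand["token_str"]
        if sub.lower() == word.lower():
            continue
        adv = sentence.replace(word, sub, 1)
        if victim_predict(adv) != original_label:
            return adv  # adversarial sample found
    return None

# usage (assumes a victim classifier `victim_predict` is available):
# unmasker = pipeline("fill-mask", model="bert-base-uncased")
# bert_attack_step("the movie was great", "great", victim_predict, unmasker)
```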
arXiv Detail & Related papers (2020-04-21T13:30:02Z)