Shifting-Merging: Secure, High-Capacity and Efficient Steganography via Large Language Models
- URL: http://arxiv.org/abs/2501.00786v1
- Date: Wed, 01 Jan 2025 09:51:15 GMT
- Title: Shifting-Merging: Secure, High-Capacity and Efficient Steganography via Large Language Models
- Authors: Minhao Bai, Jinshuai Yang, Kaiyi Pang, Yongfeng Huang, Yue Gao,
- Abstract summary: steganography offers a way to securely hide messages within innocent-looking texts.
Large Language Models (LLMs) provide high-quality and explicit distribution.
ShiMer pseudorandomly shifts the probability interval of the LLM's distribution to obtain a private distribution.
- Score: 25.52890764952079
- License:
- Abstract: In the face of escalating surveillance and censorship within the cyberspace, the sanctity of personal privacy has come under siege, necessitating the development of steganography, which offers a way to securely hide messages within innocent-looking texts. Previous methods alternate the texts to hide private massages, which is not secure. Large Language Models (LLMs) provide high-quality and explicit distribution, which is an available mathematical tool for secure steganography methods. However, existing attempts fail to achieve high capacity, time efficiency and correctness simultaneously, and their strongly coupling designs leave little room for refining them to achieve better performance. To provide a secure, high-capacity and efficient steganography method, we introduce ShiMer. Specifically, ShiMer pseudorandomly shifts the probability interval of the LLM's distribution to obtain a private distribution, and samples a token according to the private bits. ShiMer produced steganographic texts are indistinguishable in quality from the normal texts directly generated by the language model. To further enhance the capacity of ShiMer, we design a reordering algorithm to minimize the occurrence of interval splitting during decoding phase. Experimental results indicate that our method achieves the highest capacity and efficiency among existing secure steganography techniques.
Related papers
- GaussMark: A Practical Approach for Structural Watermarking of Language Models [61.84270985214254]
GaussMark is a simple, efficient, and relatively robust scheme for watermarking large language models.
We show that GaussMark is reliable, efficient, and relatively robust to corruptions such as insertions, deletions, substitutions, and roundtrip translations.
arXiv Detail & Related papers (2025-01-17T22:30:08Z) - FreStega: A Plug-and-Play Method for Boosting Imperceptibility and Capacity in Generative Linguistic Steganography for Real-World Scenarios [0.0]
Linguistic steganography embeds secret information in seemingly innocent texts, safeguarding privacy in surveillance environments.
We propose FreStega, a plug-and-play method to reconstruct the distribution of language models used for generative linguistic steganography.
arXiv Detail & Related papers (2024-12-27T13:56:51Z) - Semantic Steganography: A Framework for Robust and High-Capacity Information Hiding using Large Language Models [25.52890764952079]
generative linguistic steganography has become a prevalent technique for hiding information within model-generated texts.
We propose a semantic steganography framework based on Large Language Models (LLMs)
This framework offers robustness and reliability for transmission in complex channels, as well as resistance to text rendering and word blocking.
arXiv Detail & Related papers (2024-12-15T04:04:23Z) - Detecting, Explaining, and Mitigating Memorization in Diffusion Models [49.438362005962375]
We introduce a straightforward yet effective method for detecting memorized prompts by inspecting the magnitude of text-conditional predictions.
Our proposed method seamlessly integrates without disrupting sampling algorithms, and delivers high accuracy even at the first generation step.
Building on our detection strategy, we unveil an explainable approach that shows the contribution of individual words or tokens to memorization.
arXiv Detail & Related papers (2024-07-31T16:13:29Z) - Provably Secure Disambiguating Neural Linguistic Steganography [66.30965740387047]
The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures.
We propose a novel secure disambiguation method named SyncPool, which effectively addresses the segmentation ambiguity problem.
SyncPool does not change the size of the candidate pool or the distribution of tokens and thus is applicable to provably secure language steganography methods.
arXiv Detail & Related papers (2024-03-26T09:25:57Z) - Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z) - Perfectly Secure Steganography Using Minimum Entropy Coupling [60.154855689780796]
We show that a steganography procedure is perfectly secure under Cachin 1998's information-theoretic model of steganography.
We also show that, among perfectly secure procedures, a procedure maximizes information throughput if and only if it is induced by a minimum entropy coupling.
arXiv Detail & Related papers (2022-10-24T17:40:07Z) - Autoregressive Linguistic Steganography Based on BERT and Consistency
Coding [17.881686153284267]
Linguistic steganography (LS) conceals the presence of communication by embedding secret information into a text.
Recent algorithms use a language model (LM) to generate the steganographic text, which provides a higher payload compared with many previous arts.
We propose a novel autoregressive LS algorithm based on BERT and consistency coding, which achieves a better trade-off between embedding payload and system security.
arXiv Detail & Related papers (2022-03-26T02:36:55Z) - Provably Secure Generative Linguistic Steganography [29.919406917681282]
We present a novel provably secure generative linguistic steganographic method ADG.
ADG embeds secret information by Adaptive Dynamic Grouping of tokens according to their probability given by an off-the-shelf language model.
arXiv Detail & Related papers (2021-06-03T17:27:10Z) - BERT-ATTACK: Adversarial Attack Against BERT Using BERT [77.82947768158132]
Adrial attacks for discrete data (such as texts) are more challenging than continuous data (such as images)
We propose textbfBERT-Attack, a high-quality and effective method to generate adversarial samples.
Our method outperforms state-of-the-art attack strategies in both success rate and perturb percentage.
arXiv Detail & Related papers (2020-04-21T13:30:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.