Autoregressive Linguistic Steganography Based on BERT and Consistency Coding
- URL: http://arxiv.org/abs/2203.13972v1
- Date: Sat, 26 Mar 2022 02:36:55 GMT
- Title: Autoregressive Linguistic Steganography Based on BERT and Consistency Coding
- Authors: Xiaoyan Zheng and Hanzhou Wu
- Abstract summary: Linguistic steganography (LS) conceals the presence of communication by embedding secret information into a text.
Recent algorithms use a language model (LM) to generate the steganographic text, which provides a higher payload than many earlier approaches.
We propose a novel autoregressive LS algorithm based on BERT and consistency coding, which achieves a better trade-off between embedding payload and system security.
- Score: 17.881686153284267
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Linguistic steganography (LS) conceals the presence of communication by embedding secret information into a text. How to generate a high-quality text carrying secret information is a key problem. With the widespread application of deep learning in natural language processing, recent algorithms use a language model (LM) to generate the steganographic text, which provides a higher payload than many earlier approaches. However, the security still needs to be enhanced. To tackle this problem, we propose a novel autoregressive LS algorithm based on BERT and consistency coding, which achieves a better trade-off between embedding payload and system security. In the proposed work, building on the masked LM, given a text, we use consistency coding to make up for the shortcomings of the block coding used in previous work, so that we can encode an arbitrary-size candidate token set and take advantage of its probability distribution for information hiding. The masked positions to be embedded are filled with tokens determined in an autoregressive manner, which strengthens the connection between contexts and therefore maintains the quality of the text. Experimental results show that, compared with related works, the proposed method improves the fluency of the steganographic text while guaranteeing security, and also increases the embedding payload to a certain extent.
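The embedding procedure lends itself to a short illustration. Below is a minimal, hypothetical sketch of the autoregressive masked-LM filling loop: masked positions are completed left to right, and the secret bits select a token from the top-k candidates at each position. It substitutes a simple fixed-size (block-coding-style) bit-to-token mapping for the paper's consistency coding, and the model name and helper function are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: autoregressive masked-LM steganographic embedding.
# A fixed-size top-k candidate set stands in for the paper's consistency coding.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def embed_bits(text_with_masks: str, bits: str, k: int = 4) -> str:
    """Fill [MASK] positions left to right; the next log2(k) secret bits
    select a token from the top-k masked-LM candidates at each position."""
    enc = tokenizer(text_with_masks, return_tensors="pt")
    ids = enc["input_ids"][0]
    mask_positions = (ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    bits_per_slot = k.bit_length() - 1           # log2(k) when k is a power of two
    cursor = 0
    for pos in mask_positions:
        with torch.no_grad():
            logits = model(input_ids=ids.unsqueeze(0)).logits[0, pos]
        candidates = torch.topk(logits, k).indices
        chunk = bits[cursor:cursor + bits_per_slot].ljust(bits_per_slot, "0")
        ids[pos] = candidates[int(chunk, 2)]     # the secret bits pick the candidate
        cursor += bits_per_slot                  # later positions condition on this choice
    return tokenizer.decode(ids, skip_special_tokens=True)

print(embed_bits("the weather was [MASK] so we went [MASK] for a walk", "1001"))
```

Because the filling is deterministic given the model and the masked cover text, a receiver holding the same model can replay the loop and recover the bits from the rank of each chosen token among the reproduced candidates.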
Related papers
- Enhancing Text Generation in Joint NLG/NLU Learning Through Curriculum Learning, Semi-Supervised Training, and Advanced Optimization Techniques [0.0]
This paper develops a novel approach to improving text generation in the context of joint Natural Language Generation (NLG) and Natural Language Understanding (NLU) learning.
The data is prepared by gathering and preprocessing annotated datasets, including cleaning, tokenization, stemming, and stop-word removal.
Transformer-based encoders and decoders are employed, capturing long-range dependencies and improving source-target sequence modelling.
Reinforcement learning with policy gradient techniques, semi-supervised training, improved attention mechanisms, and differentiable approximations are employed to fine-tune the models and handle complex linguistic tasks effectively.
arXiv Detail & Related papers (2024-10-17T12:43:49Z) - Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism.
Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs.
We propose Codable Text Watermarking for LLMs (CTWL) that allows text watermarks to carry multi-bit customizable information.
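A rough, generic illustration of how emitted tokens can carry multi-bit information that only a key holder can decode is sketched below; it is a simple keyed vocabulary-partition scheme, not the CTWL construction itself, and every name in it is invented for the example.

```python
# Generic multi-bit text watermarking sketch via a keyed vocabulary partition
# (illustrative only; not the CTWL method).
import hashlib
import random

BITS_PER_TOKEN = 2  # each emitted token carries 2 message bits (4 bins)

def keyed_bins(vocab, key, n_bins=1 << BITS_PER_TOKEN):
    """Deterministically shuffle the vocabulary under the key, then deal
    tokens into n_bins round-robin so every bin is non-empty."""
    rng = random.Random(hashlib.sha256(key.encode()).hexdigest())
    shuffled = sorted(vocab)
    rng.shuffle(shuffled)
    return {tok: i % n_bins for i, tok in enumerate(shuffled)}

def embed(message_bits, vocab, key):
    """Emit one token per bit group, drawn from the bin those bits select.
    (A real scheme would let the language model rank tokens inside the bin.)"""
    bins = keyed_bins(vocab, key)
    out = []
    for i in range(0, len(message_bits), BITS_PER_TOKEN):
        target = int(message_bits[i:i + BITS_PER_TOKEN], 2)
        out.append(next(t for t in vocab if bins[t] == target))
    return out

def extract(tokens, vocab, key):
    """The key holder recovers the bits by mapping tokens back to bin indices."""
    bins = keyed_bins(vocab, key)
    return "".join(format(bins[t], f"0{BITS_PER_TOKEN}b") for t in tokens)

vocab = ["the", "a", "quick", "slow", "brown", "grey", "fox", "dog",
         "jumps", "leaps", "over", "under", "lazy", "sleepy", "cat", "hen"]
msg = "1001100111"
assert extract(embed(msg, vocab, key="secret"), vocab, key="secret") == msg
```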
arXiv Detail & Related papers (2023-07-29T14:11:15Z) - Revisiting the Roles of "Text" in Text Games [102.22750109468652]
This paper investigates the roles of text in the face of different reinforcement learning challenges.
We propose a simple scheme to extract relevant contextual information into an approximate state hash.
Such a lightweight plug-in achieves competitive performance with state-of-the-art text agents.
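As a hypothetical illustration of such an approximate state hash (not the paper's exact scheme), one can keep only salient parts of the observation, such as the location line and the inventory, so that similar game states collapse to the same key:

```python
# Hypothetical approximate state hash for a text-game agent: hash only the
# salient parts of the observation so similar states map to the same key.
import hashlib
import re

def approximate_state_hash(observation: str, inventory: str) -> str:
    location = observation.strip().splitlines()[0].lower()        # room / scene name
    items = sorted(re.findall(r"[a-z]+", inventory.lower()))      # inventory words
    return hashlib.md5((location + "|" + " ".join(items)).encode()).hexdigest()[:8]

key = approximate_state_hash("Kitchen\nYou see a table here.", "a brass key")
print(key)  # identical whenever location and inventory match
```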
arXiv Detail & Related papers (2022-10-15T21:52:39Z) - Privacy-Preserving Text Classification on BERT Embeddings with
Homomorphic Encryption [23.010346603025255]
We propose a privatization mechanism for embeddings based on homomorphic encryption.
We show that our method offers encrypted protection of BERT embeddings, while largely preserving their utility on downstream text classification tasks.
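A hedged sketch of the general idea follows, assuming the TenSEAL CKKS API; the embedding and classifier weights are random placeholders rather than real BERT outputs or the paper's actual pipeline.

```python
# Sketch: score an encrypted sentence embedding with a plaintext linear
# classifier under CKKS homomorphic encryption (placeholders only).
import random
import tenseal as ts

DIM = 768  # BERT-base hidden size

# client side: build a CKKS context and encrypt the (placeholder) embedding
context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
context.generate_galois_keys()

embedding = [random.uniform(-1.0, 1.0) for _ in range(DIM)]   # stand-in for a BERT vector
enc_embedding = ts.ckks_vector(context, embedding)

# server side: apply a plaintext linear classifier to the ciphertext;
# the server never sees the embedding in the clear
weights = [random.uniform(-0.1, 0.1) for _ in range(DIM)]
enc_score = enc_embedding.dot(weights)

# client side: only the secret-key holder can read the classification score
approx = enc_score.decrypt()[0]
exact = sum(w * x for w, x in zip(weights, embedding))
print(f"encrypted score {approx:.4f} vs plaintext score {exact:.4f}")
```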
arXiv Detail & Related papers (2022-10-05T21:46:02Z) - ConTextual Mask Auto-Encoder for Dense Passage Retrieval [49.49460769701308]
CoT-MAE is a simple yet effective generative pre-training method for dense passage retrieval.
It learns to compress the sentence semantics into a dense vector through self-supervised and context-supervised masked auto-encoding.
We conduct experiments on large-scale passage retrieval benchmarks and show considerable improvements over strong baselines.
arXiv Detail & Related papers (2022-08-16T11:17:22Z) - Semantic-Preserving Linguistic Steganography by Pivot Translation and
Semantic-Aware Bins Coding [45.13432859384438]
Linguistic steganography (LS) aims to embed secret information into a highly encoded text for covert communication.
We propose a novel LS method to modify a given text by pivoting it between two different languages.
arXiv Detail & Related papers (2022-03-08T01:35:05Z) - Text Compression-aided Transformer Encoding [77.16960983003271]
We propose explicit and implicit text compression approaches to enhance the Transformer encoding.
Backbone information, meaning the gist of the input text, is otherwise not specifically focused on.
Our evaluation on benchmark datasets shows that the proposed explicit and implicit text compression approaches improve results in comparison to strong baselines.
arXiv Detail & Related papers (2021-02-11T11:28:39Z) - Rethinking Positional Encoding in Language Pre-training [111.2320727291926]
We show that in absolute positional encoding, the addition operation applied on positional embeddings and word embeddings brings mixed correlations.
We propose a new positional encoding method called Transformer with Untied Positional Encoding (TUPE).
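A rough sketch of the untied score computation, under the assumption that word-to-word and position-to-position correlations are projected with separate parameters and summed (multi-head details and the special treatment of [CLS] are omitted):

```python
# Sketch: "untied" attention scores that keep word and position correlations
# separate instead of projecting (word + position) jointly.
import torch

seq_len, d = 8, 32
x = torch.randn(seq_len, d)                      # word embeddings
p = torch.randn(seq_len, d)                      # absolute positional embeddings
Wq, Wk = torch.randn(d, d), torch.randn(d, d)    # projections for words
Uq, Uk = torch.randn(d, d), torch.randn(d, d)    # separate projections for positions

# vanilla absolute PE: (word + position) projected jointly -> mixed correlations
mixed = ((x + p) @ Wq) @ ((x + p) @ Wk).T / (2 * d) ** 0.5

# untied: word-word and position-position terms computed separately, then summed
untied = ((x @ Wq) @ (x @ Wk).T + (p @ Uq) @ (p @ Uk).T) / (2 * d) ** 0.5

attn = torch.softmax(untied, dim=-1)
print(attn.shape)  # (seq_len, seq_len)
```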
arXiv Detail & Related papers (2020-06-28T13:11:02Z) - Hybrid Attention-Based Transformer Block Model for Distant Supervision
Relation Extraction [20.644215991166902]
We propose a new framework using a hybrid attention-based Transformer block with multi-instance learning to perform the DSRE task.
The proposed approach can outperform the state-of-the-art algorithms on the evaluation dataset.
arXiv Detail & Related papers (2020-03-10T13:05:52Z) - TEDL: A Text Encryption Method Based on Deep Learning [10.428079716944463]
This paper proposes a novel text encryption method based on deep learning called TEDL.
Results of experiments and relevant analyses show that TEDL performs well in terms of security, efficiency, and generality, and requires less frequent key redistribution.
arXiv Detail & Related papers (2020-03-09T11:04:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.