Autoregressive Linguistic Steganography Based on BERT and Consistency Coding
- URL: http://arxiv.org/abs/2203.13972v1
- Date: Sat, 26 Mar 2022 02:36:55 GMT
- Title: Autoregressive Linguistic Steganography Based on BERT and Consistency Coding
- Authors: Xiaoyan Zheng and Hanzhou Wu
- Abstract summary: Linguistic steganography (LS) conceals the presence of communication by embedding secret information into a text.
Recent algorithms use a language model (LM) to generate the steganographic text, which provides a higher payload than many earlier approaches.
We propose a novel autoregressive LS algorithm based on BERT and consistency coding, which achieves a better trade-off between embedding payload and system security.
- Score: 17.881686153284267
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Linguistic steganography (LS) conceals the presence of communication by embedding secret information into a text. How to generate a high-quality text carrying secret information is a key problem. With the widespread application of deep learning in natural language processing, recent algorithms use a language model (LM) to generate the steganographic text, which provides a higher payload than many earlier approaches. However, the security still needs to be enhanced. To tackle this problem, we propose a novel autoregressive LS algorithm based on BERT and consistency coding, which achieves a better trade-off between embedding payload and system security. In the proposed work, building on the masked LM, given a text, we use consistency coding to make up for the shortcomings of the block coding used in previous work, so that we can encode an arbitrary-size candidate token set and take advantage of its probability distribution for information hiding. The masked positions to be embedded are filled with tokens determined in an autoregressive manner, which strengthens the connection between contexts and therefore maintains the quality of the text. Experimental results show that, compared with related works, the proposed method improves the fluency of the steganographic text while guaranteeing security, and also increases the embedding payload to a certain extent.
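The embedding procedure lends itself to a short illustration. Below is a minimal, hypothetical sketch of the autoregressive masked-LM filling loop: masked positions are completed left to right, and the secret bits select a token from the top-k candidates at each position. It substitutes a simple fixed-size (block-coding-style) bit-to-token mapping for the paper's consistency coding, and the model name and helper function are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: autoregressive masked-LM steganographic embedding.
# A fixed-size top-k candidate set stands in for the paper's consistency coding.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def embed_bits(text_with_masks: str, bits: str, k: int = 4) -> str:
    """Fill [MASK] positions left to right; the next log2(k) secret bits
    select a token from the top-k masked-LM candidates at each position."""
    enc = tokenizer(text_with_masks, return_tensors="pt")
    ids = enc["input_ids"][0]
    mask_positions = (ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    bits_per_slot = k.bit_length() - 1           # log2(k) when k is a power of two
    cursor = 0
    for pos in mask_positions:
        with torch.no_grad():
            logits = model(input_ids=ids.unsqueeze(0)).logits[0, pos]
        candidates = torch.topk(logits, k).indices
        chunk = bits[cursor:cursor + bits_per_slot].ljust(bits_per_slot, "0")
        ids[pos] = candidates[int(chunk, 2)]     # the secret bits pick the candidate
        cursor += bits_per_slot                  # later positions condition on this choice
    return tokenizer.decode(ids, skip_special_tokens=True)

print(embed_bits("the weather was [MASK] so we went [MASK] for a walk", "1001"))
```

Because the filling is deterministic given the model and the masked cover text, a receiver holding the same model can replay the loop and recover the bits from the rank of each chosen token among the reproduced candidates.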
Related papers
- Enhancing Text Generation in Joint NLG/NLU Learning Through Curriculum Learning, Semi-Supervised Training, and Advanced Optimization Techniques [0.0]
This paper develops a novel approach to improving text generation in the context of joint Natural Language Generation (NLG) and Natural Language Understanding (NLU) learning.
The data is prepared by gathering and preprocessing annotated datasets, including cleaning, tokenization, stemming, and stop-word removal.
Transformer-based encoders and decoders are employed, capturing long-range dependencies and improving source-target sequence modelling.
Reinforcement learning with policy gradient techniques, semi-supervised training, improved attention mechanisms, and differentiable approximations are employed to fine-tune the models and handle complex linguistic tasks effectively.
arXiv Detail & Related papers (2024-10-17T12:43:49Z) - Towards Codable Watermarking for Injecting Multi-bits Information to LLMs [86.86436777626959]
Large language models (LLMs) generate texts with increasing fluency and realism.
Existing watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs.
We propose Codable Text Watermarking for LLMs (CTWL) that allows text watermarks to carry multi-bit customizable information.
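A rough, generic illustration of how emitted tokens can carry multi-bit information that only a key holder can decode is sketched below; it is a simple keyed vocabulary-partition scheme, not the CTWL construction itself, and every name in it is invented for the example.

```python
# Generic multi-bit text watermarking sketch via a keyed vocabulary partition
# (illustrative only; not the CTWL method).
import hashlib
import random

BITS_PER_TOKEN = 2  # each emitted token carries 2 message bits (4 bins)

def keyed_bins(vocab, key, n_bins=1 << BITS_PER_TOKEN):
    """Deterministically shuffle the vocabulary under the key, then deal
    tokens into n_bins round-robin so every bin is non-empty."""
    rng = random.Random(hashlib.sha256(key.encode()).hexdigest())
    shuffled = sorted(vocab)
    rng.shuffle(shuffled)
    return {tok: i % n_bins for i, tok in enumerate(shuffled)}

def embed(message_bits, vocab, key):
    """Emit one token per bit group, drawn from the bin those bits select.
    (A real scheme would let the language model rank tokens inside the bin.)"""
    bins = keyed_bins(vocab, key)
    out = []
    for i in range(0, len(message_bits), BITS_PER_TOKEN):
        target = int(message_bits[i:i + BITS_PER_TOKEN], 2)
        out.append(next(t for t in vocab if bins[t] == target))
    return out

def extract(tokens, vocab, key):
    """The key holder recovers the bits by mapping tokens back to bin indices."""
    bins = keyed_bins(vocab, key)
    return "".join(format(bins[t], f"0{BITS_PER_TOKEN}b") for t in tokens)

vocab = ["the", "a", "quick", "slow", "brown", "grey", "fox", "dog",
         "jumps", "leaps", "over", "under", "lazy", "sleepy", "cat", "hen"]
msg = "1001100111"
assert extract(embed(msg, vocab, key="secret"), vocab, key="secret") == msg
```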
arXiv Detail & Related papers (2023-07-29T14:11:15Z) - Revisiting the Roles of "Text" in Text Games [102.22750109468652]
This paper investigates the roles of text in the face of different reinforcement learning challenges.
We propose a simple scheme to extract relevant contextual information into an approximate state hash.
Such a lightweight plug-in achieves competitive performance with state-of-the-art text agents.
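As a hypothetical illustration of such an approximate state hash (not the paper's exact scheme), one can keep only salient parts of the observation, such as the location line and the inventory, so that similar game states collapse to the same key:

```python
# Hypothetical approximate state hash for a text-game agent: hash only the
# salient parts of the observation so similar states map to the same key.
import hashlib
import re

def approximate_state_hash(observation: str, inventory: str) -> str:
    location = observation.strip().splitlines()[0].lower()        # room / scene name
    items = sorted(re.findall(r"[a-z]+", inventory.lower()))      # inventory words
    return hashlib.md5((location + "|" + " ".join(items)).encode()).hexdigest()[:8]

key = approximate_state_hash("Kitchen\nYou see a table here.", "a brass key")
print(key)  # identical whenever location and inventory match
```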
arXiv Detail & Related papers (2022-10-15T21:52:39Z) - Privacy-Preserving Text Classification on BERT Embeddings with
Homomorphic Encryption [23.010346603025255]
We propose a privatization mechanism for embeddings based on homomorphic encryption.
We show that our method offers encrypted protection of BERT embeddings, while largely preserving their utility on downstream text classification tasks.
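A hedged sketch of the general idea follows, assuming the TenSEAL CKKS API; the embedding and classifier weights are random placeholders rather than real BERT outputs or the paper's actual pipeline.

```python
# Sketch: score an encrypted sentence embedding with a plaintext linear
# classifier under CKKS homomorphic encryption (placeholders only).
import random
import tenseal as ts

DIM = 768  # BERT-base hidden size

# client side: build a CKKS context and encrypt the (placeholder) embedding
context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
context.generate_galois_keys()

embedding = [random.uniform(-1.0, 1.0) for _ in range(DIM)]   # stand-in for a BERT vector
enc_embedding = ts.ckks_vector(context, embedding)

# server side: apply a plaintext linear classifier to the ciphertext;
# the server never sees the embedding in the clear
weights = [random.uniform(-0.1, 0.1) for _ in range(DIM)]
enc_score = enc_embedding.dot(weights)

# client side: only the secret-key holder can read the classification score
approx = enc_score.decrypt()[0]
exact = sum(w * x for w, x in zip(weights, embedding))
print(f"encrypted score {approx:.4f} vs plaintext score {exact:.4f}")
```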
arXiv Detail & Related papers (2022-10-05T21:46:02Z) - ConTextual Mask Auto-Encoder for Dense Passage Retrieval [49.49460769701308]
CoT-MAE is a simple yet effective generative pre-training method for dense passage retrieval.
It learns to compress the sentence semantics into a dense vector through self-supervised and context-supervised masked auto-encoding.
We conduct experiments on large-scale passage retrieval benchmarks and show considerable improvements over strong baselines.
arXiv Detail & Related papers (2022-08-16T11:17:22Z) - Semantic-Preserving Linguistic Steganography by Pivot Translation and
Semantic-Aware Bins Coding [45.13432859384438]
Linguistic steganography (LS) aims to embed secret information into a highly encoded text for covert communication.
We propose a novel LS method to modify a given text by pivoting it between two different languages.
arXiv Detail & Related papers (2022-03-08T01:35:05Z) - Text Compression-aided Transformer Encoding [77.16960983003271]
We propose explicit and implicit text compression approaches to enhance the Transformer encoding.
Backbone information, meaning the gist of the input text, is otherwise not specifically focused on.
Our evaluation on benchmark datasets shows that the proposed explicit and implicit text compression approaches improve results in comparison to strong baselines.
arXiv Detail & Related papers (2021-02-11T11:28:39Z) - Rethinking Positional Encoding in Language Pre-training [111.2320727291926]
We show that in absolute positional encoding, the addition operation applied on positional embeddings and word embeddings brings mixed correlations.
We propose a new positional encoding method called Transformer with Untied Positional Encoding (TUPE).
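A rough sketch of the untied score computation, under the assumption that word-to-word and position-to-position correlations are projected with separate parameters and summed (multi-head details and the special treatment of [CLS] are omitted):

```python
# Sketch: "untied" attention scores that keep word and position correlations
# separate instead of projecting (word + position) jointly.
import torch

seq_len, d = 8, 32
x = torch.randn(seq_len, d)                      # word embeddings
p = torch.randn(seq_len, d)                      # absolute positional embeddings
Wq, Wk = torch.randn(d, d), torch.randn(d, d)    # projections for words
Uq, Uk = torch.randn(d, d), torch.randn(d, d)    # separate projections for positions

# vanilla absolute PE: (word + position) projected jointly -> mixed correlations
mixed = ((x + p) @ Wq) @ ((x + p) @ Wk).T / (2 * d) ** 0.5

# untied: word-word and position-position terms computed separately, then summed
untied = ((x @ Wq) @ (x @ Wk).T + (p @ Uq) @ (p @ Uk).T) / (2 * d) ** 0.5

attn = torch.softmax(untied, dim=-1)
print(attn.shape)  # (seq_len, seq_len)
```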
arXiv Detail & Related papers (2020-06-28T13:11:02Z) - Hybrid Attention-Based Transformer Block Model for Distant Supervision
Relation Extraction [20.644215991166902]
We propose a new framework using a hybrid attention-based Transformer block with multi-instance learning to perform the DSRE task.
The proposed approach can outperform the state-of-the-art algorithms on the evaluation dataset.
arXiv Detail & Related papers (2020-03-10T13:05:52Z) - TEDL: A Text Encryption Method Based on Deep Learning [10.428079716944463]
This paper proposes a novel text encryption method based on deep learning called TEDL.
Results of experiments and relevant analyses show that TEDL performs well in terms of security, efficiency, and generality, and requires less frequent key redistribution.
arXiv Detail & Related papers (2020-03-09T11:04:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.