Near-imperceptible Neural Linguistic Steganography via Self-Adjusting
Arithmetic Coding
- URL: http://arxiv.org/abs/2010.00677v1
- Date: Thu, 1 Oct 2020 20:40:23 GMT
- Title: Near-imperceptible Neural Linguistic Steganography via Self-Adjusting
Arithmetic Coding
- Authors: Jiaming Shen and Heng Ji and Jiawei Han
- Abstract summary: We present a new linguistic steganography method which encodes secret messages using self-adjusting arithmetic coding based on a neural language model.
Human evaluations show that 51% of generated cover texts can indeed fool eavesdroppers.
- Score: 88.31226340759892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Linguistic steganography studies how to hide secret messages in natural
language cover texts. Traditional methods aim to transform a secret message
into an innocent text via lexical substitution or syntactical modification.
Recently, advances in neural language models (LMs) enable us to directly
generate cover text conditioned on the secret message. In this study, we
present a new linguistic steganography method which encodes secret messages
using self-adjusting arithmetic coding based on a neural language model. We
formally analyze the statistical imperceptibility of this method and
empirically show it outperforms the previous state-of-the-art methods on four
datasets by 15.3% and 38.9% in terms of bits/word and KL metrics, respectively.
Finally, human evaluations show that 51% of generated cover texts can indeed
fool eavesdroppers.
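To make the encoding idea concrete, the following is a minimal Python sketch of arithmetic-coding-based cover text generation: the secret bits are read as a binary fraction in [0, 1), and at each step the token whose probability subinterval contains that point is emitted. A toy fixed distribution stands in for the neural LM's conditional distribution, and the helper names (`bits_to_fraction`, `encode_step`) are illustrative; the paper's self-adjusting precision scheme, which is what drives the imperceptibility analysis above, is not reproduced here.

```python
# A minimal sketch of arithmetic-coding-based steganographic generation.
# A toy fixed distribution stands in for the neural LM; the paper's
# self-adjusting precision mechanism is NOT reproduced here.

def bits_to_fraction(bits, precision=32):
    """Interpret up to `precision` secret bits as a number in [0, 1)."""
    padded = (bits + [0] * precision)[:precision]
    return sum(b / 2 ** (i + 1) for i, b in enumerate(padded))

def encode_step(point, next_token_probs):
    """Emit the token whose subinterval of [0, 1) contains `point`.

    `next_token_probs` is a list of (token, prob) pairs in a canonical
    order shared with the receiver, who reruns the same LM to decode.
    Returns the token and the point rescaled into its subinterval.
    """
    low = 0.0
    for token, prob in next_token_probs:
        high = low + prob
        if point < high:
            return token, (point - low) / prob
        low = high
    return next_token_probs[-1][0], 0.0  # guard against float round-off

# Toy usage: a fixed unigram distribution instead of p(token | prefix).
toy_probs = [("the", 0.5), ("a", 0.3), ("an", 0.2)]
point = bits_to_fraction([1, 0, 1, 1])  # secret bits 1011 -> 0.6875
cover_text = []
for _ in range(5):
    token, point = encode_step(point, toy_probs)
    cover_text.append(token)
print(" ".join(cover_text))  # "a a the an the"
```

Decoding reverses the process: the receiver reruns the same LM over the cover text, identifies which subinterval each observed token occupies, and reconstructs the secret bits from the nested intervals.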
Related papers
- Zero-shot Generative Linguistic Steganography [31.19052670719132]
We propose a novel zero-shot approach based on in-context learning for linguistic steganography to achieve better perceptual and statistical imperceptibility.
Our experimental results indicate that our method produces $1.926\times$ more innocent and intelligible stegotext than any other method.
arXiv Detail & Related papers (2024-03-16T08:31:25Z) - Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods such as typos and word-order shuffling, which resonate with human cognitive patterns and allow perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z) - Hiding in Plain Sight: Towards the Science of Linguistic Steganography [0.0]
Covert communication (also known as steganography) is the practice of concealing a secret inside an innocuous-looking public object (covert code).
Linguistic steganography is the practice of encoding a secret message in natural language text such as spoken conversation or short public communications such as tweets.
arXiv Detail & Related papers (2023-12-28T06:00:55Z) - Reverse-Engineering Decoding Strategies Given Blackbox Access to a
Language Generation System [73.52878118434147]
We present methods to reverse-engineer the decoding method used to generate text.
Our ability to discover which decoding strategy was used has implications for detecting generated text.
arXiv Detail & Related papers (2023-09-09T18:19:47Z) - DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of
GPT-Generated Text [82.5469544192645]
We propose a novel training-free detection strategy called Divergent N-Gram Analysis (DNA-GPT)
By analyzing the differences between the original and new remaining parts through N-gram analysis, we unveil significant discrepancies between the distribution of machine-generated text and human-written text.
Results show that our zero-shot approach exhibits state-of-the-art performance in distinguishing between human and GPT-generated text.
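The abstract suggests the detector truncates a candidate text, has the model regenerate the remainder, and scores n-gram overlap between the original and regenerated parts. Here is a rough Python sketch under that assumption; the function names are illustrative, not DNA-GPT's actual API.

```python
# A rough sketch of the divergent n-gram idea, assuming (per the abstract)
# that the candidate text is truncated, the model regenerates the remainder,
# and n-gram overlap between original and regenerated parts is compared.

def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_overlap(original_part, regenerated_part, n=3):
    """Fraction of the original remainder's n-grams the model reproduces."""
    orig = ngrams(original_part, n)
    return len(orig & ngrams(regenerated_part, n)) / max(len(orig), 1)

# Machine-generated text tends to be re-derivable from its own prefix, so
# high overlap suggests machine origin; the threshold is tuned on data.
original = "the model generates fluent text with high probability".split()
regen = "the model generates fluent text with low perplexity".split()
print(ngram_overlap(original, regen))  # ~0.67 for these toy remainders
```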
arXiv Detail & Related papers (2023-05-27T03:58:29Z) - Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot
Sentiment Classification [78.120927891455]
State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks.
In this paper, we extend the problem to open vocabulary Electroencephalography(EEG)-To-Text Sequence-To-Sequence decoding and zero-shot sentence sentiment classification on natural reading tasks.
Our model achieves a 40.1% BLEU-1 score on EEG-To-Text decoding and a 55.6% F1 score on zero-shot EEG-based ternary sentiment classification, which significantly outperforms supervised baselines.
arXiv Detail & Related papers (2021-12-05T21:57:22Z) - Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z) - Exploiting Language Model for Efficient Linguistic Steganalysis: An
Empirical Study [23.311007481830647]
We present two methods for efficient linguistic steganalysis.
One is to pre-train a language model based on RNN, and the other is to pre-train a sequence autoencoder.
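A minimal PyTorch sketch of the first variant: pre-train an RNN language model, then reuse its encoder as a feature extractor for a stego-vs-cover classifier. The layer sizes and the classification head are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: pre-train an RNN LM, then fine-tune a binary steganalysis head
# on its hidden states. Sizes and head design are illustrative assumptions.
import torch
import torch.nn as nn

class RNNLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.lm_head = nn.Linear(hidden_dim, vocab_size)  # next-token logits

    def forward(self, token_ids):
        hidden_states, _ = self.rnn(self.embed(token_ids))
        return self.lm_head(hidden_states), hidden_states

class Steganalyzer(nn.Module):
    """Binary stego-vs-cover classifier on the pre-trained RNN encoder."""
    def __init__(self, pretrained_lm):
        super().__init__()
        self.lm = pretrained_lm             # weights from LM pre-training
        self.classifier = nn.Linear(256, 2)  # stego vs. cover

    def forward(self, token_ids):
        _, hidden_states = self.lm(token_ids)
        return self.classifier(hidden_states[:, -1])  # last-step features

lm = RNNLanguageModel(vocab_size=10000)
model = Steganalyzer(lm)
logits = model(torch.randint(0, 10000, (4, 20)))  # batch of 4 sequences
print(logits.shape)  # torch.Size([4, 2])
```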
arXiv Detail & Related papers (2021-07-26T12:37:18Z) - Provably Secure Generative Linguistic Steganography [29.919406917681282]
We present ADG, a novel provably secure generative linguistic steganography method.
ADG embeds secret information by Adaptive Dynamic Grouping of tokens according to their probability given by an off-the-shelf language model.
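A hedged sketch of the grouping idea as described in this abstract: tokens are dynamically grouped by probability mass, secret bits select a group, and the cover token is sampled within it. The greedy balancing rule and the two-bit setting below are illustrative assumptions; ADG's actual grouping rule and security proof are in the paper.

```python
# Illustrative sketch of probability-balanced token grouping, assuming the
# vocabulary at each step has at least `num_groups` tokens.
import random

def group_tokens(probs, num_groups):
    """Greedily assign tokens to `num_groups` bins of roughly equal mass."""
    groups = [[] for _ in range(num_groups)]
    mass = [0.0] * num_groups
    # Largest-first greedy balancing of probability mass across groups.
    for token, p in sorted(probs, key=lambda tp: -tp[1]):
        i = mass.index(min(mass))
        groups[i].append((token, p))
        mass[i] += p
    return groups

def embed_bits(bits, probs):
    """Use two secret bits to pick one of four groups, sample a token."""
    groups = group_tokens(probs, num_groups=4)
    index = bits[0] * 2 + bits[1]
    tokens, weights = zip(*groups[index])
    return random.choices(tokens, weights=weights)[0]

toy = [("the", 0.4), ("a", 0.2), ("cat", 0.15), ("dog", 0.15), ("sat", 0.1)]
print(embed_bits([1, 0], toy))  # a token from group 2 encodes bits "10"
```

The receiver, sharing the LM and the grouping rule, recomputes the groups at each step and reads off the bits from which group the observed token falls in.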
arXiv Detail & Related papers (2021-06-03T17:27:10Z) - Graph-Stega: Semantic Controllable Steganographic Text Generation Guided
by Knowledge Graph [29.189037080306353]
This paper proposes a new generative text steganography method which is quite different from existing models.
We use a Knowledge Graph (KG) to guide the generation of steganographic sentences.
The experimental results show that the proposed model can guarantee both the quality of the generated text and its semantic expression.
arXiv Detail & Related papers (2020-06-02T06:53:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.