Hiding in Plain Sight: Towards the Science of Linguistic Steganography
- URL: http://arxiv.org/abs/2312.16840v1
- Date: Thu, 28 Dec 2023 06:00:55 GMT
- Title: Hiding in Plain Sight: Towards the Science of Linguistic Steganography
- Authors: Leela Raj-Sankar and S. Raj Rajagopalan
- Abstract summary: Covert communication (also known as steganography) is the practice of concealing a secret inside an innocuous-looking public object (cover).
Linguistic steganography is the practice of encoding a secret message in natural language text such as spoken conversation or short public communications such as tweets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Covert communication (also known as steganography) is the practice of
concealing a secret inside an innocuous-looking public object (cover) so that
the modified public object (covert code) makes sense to everyone but only
someone who knows the code can extract the secret (message). Linguistic
steganography is the practice of encoding a secret message in natural language
text such as spoken conversation or short public communications such as
tweets. While ad hoc methods for covert communications in specific domains
exist (JPEG images, Chinese poetry, etc.), there is no general model for
linguistic steganography specifically. We present a novel mathematical
formalism for creating linguistic steganographic codes, with three parameters:
decodability (probability that the receiver of the coded message will decode
the cover correctly), density (frequency of code words in a cover code), and
detectability (probability that an attacker can distinguish an untampered
cover from its steganized version). Verbal or linguistic
steganography is most challenging because of its lack of artifacts to hide the
secret message in. We detail a practical construction in Python of a
steganographic code for Tweets using inserted words to encode hidden digits
while using n-gram frequency distortion as the measure of detectability of the
insertions. Using the publicly accessible Stanford Sentiment Analysis dataset,
we implemented the tweet steganization scheme: a codeword (an existing word
in the data set) is inserted at random positions in random existing tweets to
find the tweet that has the least possible n-gram distortion. We argue that this
approximates KL distance in a localized manner at low cost and thus we get a
linguistic steganography scheme that is both formal and practical and permits a
tradeoff between codeword density and detectability of the covert message.
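The insertion search described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy corpus, the function names (`build_bigram_counts`, `distortion`, `steganize`), the use of bigrams only, and the inverse-frequency distortion score are all assumptions chosen for brevity.

```python
import random
from collections import Counter


def bigrams(tokens):
    """Return the list of adjacent word pairs in a token list."""
    return list(zip(tokens, tokens[1:]))


def build_bigram_counts(corpus):
    """Count bigrams over a corpus given as a list of token lists."""
    counts = Counter()
    for tokens in corpus:
        counts.update(bigrams(tokens))
    return counts


def distortion(tokens, pos, codeword, counts):
    """Hypothetical local distortion of inserting `codeword` at index `pos`.

    Only the (at most two) bigrams created by the insertion are scored.
    Each contributes 1 / (1 + corpus count), so bigrams that are common
    in the corpus add little and unseen bigrams add 1.0.
    """
    new_pairs = []
    if pos > 0:
        new_pairs.append((tokens[pos - 1], codeword))
    if pos < len(tokens):
        new_pairs.append((codeword, tokens[pos]))
    return sum(1.0 / (1 + counts[pair]) for pair in new_pairs)


def steganize(corpus, codeword, counts, trials=200, seed=0):
    """Try random (tweet, position) pairs and keep the least-distorted one."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        tokens = rng.choice(corpus)
        pos = rng.randint(0, len(tokens))  # inclusive bounds: may append at the end
        score = distortion(tokens, pos, codeword, counts)
        if best is None or score < best[0]:
            best = (score, tokens[:pos] + [codeword] + tokens[pos:])
    return best


# Toy stand-in for a tweet corpus (assumption: whitespace tokenization).
corpus = [
    "i love this movie so much".split(),
    "this movie was great".split(),
    "i love great coffee".split(),
    "so much coffee today".split(),
]
counts = build_bigram_counts(corpus)
score, stego = steganize(corpus, "great", counts)
print(score, " ".join(stego))
```

Scoring only the bigrams that touch the inserted word keeps each trial cheap, which is in the spirit of the abstract's "localized" approximation; the actual measure in the paper may differ.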
Related papers
- Provably Secure Disambiguating Neural Linguistic Steganography [66.30965740387047]
The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures.
We propose a novel secure disambiguation method named SyncPool, which effectively addresses the segmentation ambiguity problem.
SyncPool does not change the size of the candidate pool or the distribution of tokens and thus is applicable to provably secure language steganography methods.
arXiv Detail & Related papers (2024-03-26T09:25:57Z)
- Fantastic Semantics and Where to Find Them: Investigating Which Layers of Generative LLMs Reflect Lexical Semantics [50.982315553104975]
We investigate the bottom-up evolution of lexical semantics for a popular large language model, namely Llama2.
Our experiments show that the representations in lower layers encode lexical semantics, while the higher layers, with weaker semantic induction, are responsible for prediction.
This is in contrast to models with discriminative objectives, such as masked language modeling, where the higher layers obtain better lexical semantics.
arXiv Detail & Related papers (2024-03-03T13:14:47Z)
- Quantum Steganography via Coherent and Fock State Encoding in an Optical Medium [0.0]
Steganography is an alternative to cryptography, where information is protected by secrecy.
We develop schemes for steganographic communication using Fock and coherent states in optical channels based on disguising the communications as thermal noise.
arXiv Detail & Related papers (2023-03-04T03:18:28Z)
- Hiding Images in Deep Probabilistic Models [58.23127414572098]
We describe a different computational framework to hide images in deep probabilistic models.
Specifically, we use a DNN to model the probability density of cover images, and hide a secret image in one particular location of the learned distribution.
We demonstrate the feasibility of our SinGAN approach in terms of extraction accuracy and model security.
arXiv Detail & Related papers (2022-10-05T13:33:25Z)
- Deniable Steganography [30.729865153060985]
Steganography conceals the secret message into the cover media, generating a stego media which can be transmitted on public channels without drawing suspicion.
As its countermeasure, steganalysis mainly aims to detect whether the secret message is hidden in a given media.
We propose a receiver-deniable steganographic scheme to deal with the receiver-side coercive attack using deep neural networks (DNN).
arXiv Detail & Related papers (2022-05-25T09:00:30Z)
- Semantic-Preserving Linguistic Steganography by Pivot Translation and Semantic-Aware Bins Coding [45.13432859384438]
Linguistic steganography (LS) aims to embed secret information into a highly encoded text for covert communication.
We propose a novel LS method to modify a given text by pivoting it between two different languages.
arXiv Detail & Related papers (2022-03-08T01:35:05Z)
- Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech impaired people to communicate.
Skeleton-based recognition is becoming popular because it can be further ensembled with RGB-D based methods to achieve state-of-the-art performance.
Inspired by the recent development of whole-body pose estimation, we propose recognizing sign language based on the whole-body key points and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z)
- News Image Steganography: A Novel Architecture Facilitates the Fake News Identification [52.83247667841588]
A large portion of fake news quotes untampered images from other sources with ulterior motives.
This paper proposes an architecture named News Image Steganography to reveal the inconsistency through image steganography based on GAN.
arXiv Detail & Related papers (2021-01-03T11:12:23Z)
- Differential Privacy and Natural Language Processing to Generate Contextually Similar Decoy Messages in Honey Encryption Scheme [0.0]
Honey Encryption is an approach to encrypting messages using low min-entropy keys, such as weak passwords, OTPs, PINs, and credit card numbers.
The ciphertext, when decrypted with any incorrect key, produces plausible-looking but bogus plaintext called "honey messages".
A gibberish, random assortment of words is not enough to fool an attacker; it will not be acceptable or convincing, whether or not the attacker knows some information about the genuine source.
arXiv Detail & Related papers (2020-10-29T23:02:32Z)
- Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding [88.31226340759892]
We present a new linguistic steganography method which encodes secret messages using self-adjusting arithmetic coding based on a neural language model.
Human evaluations show that 51% of generated cover texts can indeed fool eavesdroppers.
arXiv Detail & Related papers (2020-10-01T20:40:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.