Related papers: Hiding in Plain Sight: Towards the Science of Linguistic Steganography

Hiding in Plain Sight: Towards the Science of Linguistic Steganography

URL: http://arxiv.org/abs/2312.16840v1
Date: Thu, 28 Dec 2023 06:00:55 GMT
Title: Hiding in Plain Sight: Towards the Science of Linguistic Steganography
Authors: Leela Raj-Sankar and S. Raj Rajagopalan
Abstract summary: Covert communication (also known as steganography) is the practice of concealing a secret inside an innocuous-looking public object (cover). Linguistic steganography is the practice of encoding a secret message in natural language text such as spoken conversation or short public communications such as tweets.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Covert communication (also known as steganography) is the practice of concealing a secret inside an innocuous-looking public object (cover) so that the modified public object (covert code) makes sense to everyone but only someone who knows the code can extract the secret (message). Linguistic steganography is the practice of encoding a secret message in natural language text such as spoken conversation or short public communications such as tweets. While ad hoc methods for covert communications in specific domains exist (JPEG images, Chinese poetry, etc.), there is no general model for linguistic steganography specifically. We present a novel mathematical formalism for creating linguistic steganographic codes, with three parameters: decodability (probability that the receiver of the coded message will decode the cover correctly), density (frequency of code words in a cover code), and detectability (probability that an attacker can distinguish an untampered cover from its steganized version). Verbal or linguistic steganography is most challenging because of its lack of artifacts to hide the secret message in. We detail a practical construction in Python of a steganographic code for Tweets using inserted words to encode hidden digits while using n-gram frequency distortion as the measure of detectability of the insertions. Using the publicly accessible Stanford Sentiment Analysis dataset we implemented the tweet steganization scheme: a codeword (an existing word in the data set) is inserted at random positions in random existing tweets to find the tweet that has the least possible n-gram distortion. We argue that this approximates KL distance in a localized manner at low cost and thus we get a linguistic steganography scheme that is both formal and practical and permits a tradeoff between codeword density and detectability of the covert message.
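
The insertion scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy corpus, the `CODEBOOK` mapping, the add-one-smoothed bigram cost, and the function names are all assumptions made for illustration, and the search is exhaustive rather than over random tweets and positions. The distortion of an insertion is taken here as the change in bigram negative log-probability when two new bigrams replace one existing bigram.

```python
import math
from collections import Counter

# Hypothetical codebook: each hidden digit maps to a word already in the corpus.
CODEBOOK = {"0": "really", "1": "good"}


def build_counts(corpus):
    """Count unigrams and bigrams over tokenized tweets, with boundary markers."""
    uni, bi = Counter(), Counter()
    for tokens in corpus:
        padded = ["<s>"] + tokens + ["</s>"]
        uni.update(padded)
        bi.update(zip(padded, padded[1:]))
    return uni, bi


def cost(pair, uni, bi):
    """Negative log of the add-one-smoothed probability P(second | first)."""
    return -math.log((bi[pair] + 1) / (uni[pair[0]] + len(uni)))


def distortion(tokens, pos, word, uni, bi):
    """Cost change when `word` is inserted at index `pos` of `tokens`."""
    padded = ["<s>"] + tokens + ["</s>"]
    left, right = padded[pos], padded[pos + 1]
    added = cost((left, word), uni, bi) + cost((word, right), uni, bi)
    return added - cost((left, right), uni, bi)


def steganize(digit, corpus, codebook=CODEBOOK):
    """Return (distortion, stego_text) for the least-distorting insertion."""
    word = codebook[digit]
    codewords = set(codebook.values())
    uni, bi = build_counts(corpus)
    best = None
    for tokens in corpus:
        # Assumption: skip covers that already contain a codeword,
        # so the receiver cannot decode a spurious digit.
        if codewords & set(tokens):
            continue
        for pos in range(len(tokens) + 1):
            d = distortion(tokens, pos, word, uni, bi)
            if best is None or d < best[0]:
                best = (d, tokens[:pos] + [word] + tokens[pos:])
    if best is None:
        raise ValueError("no cover tweet is free of codewords")
    return best[0], " ".join(best[1])


def decode(text, codebook=CODEBOOK):
    """Recover hidden digits by reading codewords in order of appearance."""
    reverse = {w: d for d, w in codebook.items()}
    return "".join(reverse[t] for t in text.split() if t in reverse)


# Toy corpus standing in for the tweet data set (an assumption).
corpus = [
    "the movie was really good".split(),
    "i really liked the ending".split(),
    "the plot was slow".split(),
]
score, stego = steganize("0", corpus)
print(stego, round(score, 3), decode(stego))
```

In this sketch, a lower score means the insertion perturbs the local bigram statistics less, which serves as a rough proxy for lower detectability. Inserting more codewords per tweet would raise density at the cost of a higher total distortion.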

Related papers

Provably Secure Public-Key Steganography Based on Admissible Encoding [66.38591467056939]
The technique of hiding secret messages within seemingly harmless covertext is known as provably secure steganography (PSS). PSS evolves from symmetric key steganography to public-key steganography, functioning without the requirement of a pre-shared key. This paper proposes a more general elliptic curve public key steganography method based on admissible encoding.
arXiv Detail & Related papers (2025-04-28T03:42:25Z)
Robust Steganography from Large Language Models [1.5749416770494704]
We study the problem of robust steganography. We design and implement our steganographic schemes that embed arbitrary secret messages into natural language text.
arXiv Detail & Related papers (2025-04-11T21:06:36Z)
Innamark: A Whitespace Replacement Information-Hiding Method [0.0]
We introduce a novel method for information hiding called Innamark. Innamark can conceal any byte-encoded sequence within a sufficiently long cover text. We propose a specified structure for secret messages that enables compression, encryption, hashing, and error correction.
arXiv Detail & Related papers (2025-02-18T10:21:27Z)
Provably Secure Disambiguating Neural Linguistic Steganography [66.30965740387047]
The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures. We propose a novel secure disambiguation method named SyncPool, which effectively addresses the segmentation ambiguity problem. SyncPool does not change the size of the candidate pool or the distribution of tokens and thus is applicable to provably secure language steganography methods.
arXiv Detail & Related papers (2024-03-26T09:25:57Z)
Fantastic Semantics and Where to Find Them: Investigating Which Layers of Generative LLMs Reflect Lexical Semantics [50.982315553104975]
We investigate the bottom-up evolution of lexical semantics for a popular large language model, namely Llama2. Our experiments show that the representations in lower layers encode lexical semantics, while the higher layers, with weaker semantic induction, are responsible for prediction. This is in contrast to models with discriminative objectives, such as mask language modeling, where the higher layers obtain better lexical semantics.
arXiv Detail & Related papers (2024-03-03T13:14:47Z)
Quantum Steganography via Coherent and Fock State Encoding in an Optical Medium [0.0]
Steganography is an alternative to cryptography, where information is protected by secrecy. We develop schemes for steganographic communication using Fock and coherent states in optical channels based on disguising the communications as thermal noise.
arXiv Detail & Related papers (2023-03-04T03:18:28Z)
Hiding Images in Deep Probabilistic Models [58.23127414572098]
We describe a different computational framework to hide images in deep probabilistic models. Specifically, we use a DNN to model the probability density of cover images, and hide a secret image in one particular location of the learned distribution. We demonstrate the feasibility of our SinGAN approach in terms of extraction accuracy and model security.
arXiv Detail & Related papers (2022-10-05T13:33:25Z)
Deniable Steganography [30.729865153060985]
Steganography conceals the secret message in the cover media, generating a stego media which can be transmitted on public channels without drawing suspicion. As its countermeasure, steganalysis mainly aims to detect whether a secret message is hidden in a given media. We propose a receiver-deniable steganographic scheme to deal with the receiver-side coercive attack using deep neural networks (DNN).
arXiv Detail & Related papers (2022-05-25T09:00:30Z)
Semantic-Preserving Linguistic Steganography by Pivot Translation and Semantic-Aware Bins Coding [45.13432859384438]
Linguistic steganography (LS) aims to embed secret information into a highly encoded text for covert communication. We propose a novel LS method to modify a given text by pivoting it between two different languages.
arXiv Detail & Related papers (2022-03-08T01:35:05Z)
Skeleton Based Sign Language Recognition Using Whole-body Keypoints [71.97020373520922]
Sign language is used by deaf or speech impaired people to communicate. Skeleton-based recognition is becoming popular because it can be further ensembled with RGB-D based methods to achieve state-of-the-art performance. Inspired by the recent development of whole-body pose estimation, we propose recognizing sign language based on the whole-body key points and features.
arXiv Detail & Related papers (2021-03-16T03:38:17Z)
News Image Steganography: A Novel Architecture Facilitates the Fake News Identification [52.83247667841588]
A larger portion of fake news quotes untampered images from other sources with ulterior motives. This paper proposes an architecture named News Image Steganography to reveal the inconsistency through image steganography based on GAN.
arXiv Detail & Related papers (2021-01-03T11:12:23Z)
Differential Privacy and Natural Language Processing to Generate Contextually Similar Decoy Messages in Honey Encryption Scheme [0.0]
Honey Encryption is an approach to encrypt messages using low min-entropy keys, such as weak passwords, OTPs, PINs, and credit card numbers. The ciphertext, when decrypted with any number of incorrect keys, produces plausible-looking but bogus plaintext called "honey messages". A gibberish, random assortment of words is not enough to fool an attacker; that will not be acceptable and convincing, whether or not the attacker knows some information about the genuine source.
arXiv Detail & Related papers (2020-10-29T23:02:32Z)
Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding [88.31226340759892]
We present a new linguistic steganography method which encodes secret messages using self-adjusting arithmetic coding based on a neural language model. Human evaluations show that 51% of generated cover texts can indeed fool eavesdroppers.
arXiv Detail & Related papers (2020-10-01T20:40:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.