Hiding Sensitive Information Using PDF Steganography
- URL: http://arxiv.org/abs/2405.00865v1
- Date: Wed, 1 May 2024 20:54:12 GMT
- Title: Hiding Sensitive Information Using PDF Steganography
- Authors: Ryan Klemm, Bo Chen
- Abstract summary: We present a novel PDF steganography algorithm based upon least-significant bit insertion into the real-valued operands of PDF stream operators.
We also provide a case study which embeds malware into a given cover PDF document.
- Score: 3.6533698604619587
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of steganography to transmit secret data is becoming increasingly common in security products and malware today. Despite being extremely popular, PDF files are not often the focus of steganography research, as most applications utilize digital image, audio, and video files as their cover data. However, the PDF file format is promising for usage in medium-capacity steganography applications. In this paper, we present a novel PDF steganography algorithm based upon least-significant bit insertion into the real-valued operands of PDF stream operators. Where prior research has only considered a small subset of these operators, we take an extensive look at all the possible operators defined in the Adobe PDF standard to evaluate their usability in our steganography algorithm. We also provide a case study which embeds malware into a given cover PDF document.
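The abstract does not spell out the exact encoding, so the following is only a minimal sketch of the general idea of least-significant bit insertion into real-valued operands. It assumes operands are written as decimal strings with a fractional part and hides one bit per operand in the parity of the last decimal digit. The names `embed_bits` and `extract_bits`, the regular expression, and the sample content stream are all hypothetical illustrations, not the authors' algorithm.

```python
import re

# Assumption: a "real-valued operand" is any token with a fractional part, e.g. "72.125".
# A real PDF parser would tokenize the content stream properly instead of using a regex.
REAL = re.compile(r"-?\d*\.\d+")


def embed_bits(stream: str, bits: str) -> str:
    """Hide a bit string by forcing the parity of each real operand's last digit."""
    remaining = iter(bits)

    def replace(match: re.Match) -> str:
        token = match.group(0)
        bit = next(remaining, None)
        if bit is None:
            return token  # payload exhausted: leave later operands untouched
        last_digit = (int(token[-1]) & ~1) | int(bit)  # clear parity, then set it to the bit
        return token[:-1] + str(last_digit)

    stego = REAL.sub(replace, stream)
    if next(remaining, None) is not None:
        raise ValueError("cover stream has too few real operands for the payload")
    return stego


def extract_bits(stream: str, count: int) -> str:
    """Read back the first `count` hidden bits from the operands' last-digit parity."""
    parities = (str(int(m.group(0)[-1]) & 1) for m in REAL.finditer(stream))
    return "".join(parities)[:count]


# Hypothetical content-stream fragment: a text position (Td) and a fill color (rg).
cover = "BT 72.125 712.500 Td 0.250 0.500 0.750 rg ET"
payload = "10110"
stego = embed_bits(cover, payload)
print(stego)
print(extract_bits(stego, len(payload)))
```

Each operand changes by at most one unit in its last decimal place, which is why this kind of embedding is expected to have little visible effect on rendering; how much capacity and distortion each operator actually tolerates is the question the paper's operator survey addresses.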
Related papers
- PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling [63.93112754821312]
Document understanding is a challenging task that requires processing and comprehending large amounts of textual and visual information.
Recent advances in Large Language Models (LLMs) have significantly improved the performance of this task.
We introduce PDF-WuKong, a multimodal large language model (MLLM) which is designed to enhance multimodal question-answering (QA) for long PDF documents.
arXiv Detail & Related papers (2024-10-08T12:17:42Z)
- Harnessing Lightweight Ciphers for PDF Encryption [1.104960878651584]
Portable Document Format (PDF) is used worldwide as the de facto standard for exchanging documents.
At present, PDF encryption only supports Advanced Encryption Standard (AES) to encrypt and decrypt information.
Lightweight cryptography, which refers to cryptography for resource-constrained environments, has gained a lot of popularity.
arXiv Detail & Related papers (2024-09-14T12:59:04Z)
- An Extensive Survey of Digital Image Steganography: State of the Art [0.0]
The need to protect the privacy of sensitive information during information exchange over the internet/intranet has led to wide adoption of cryptography and steganography.
This paper critically analyzes the current steganographic techniques, recent trends, and challenges.
arXiv Detail & Related papers (2024-04-30T13:16:24Z)
- A Feature Set of Small Size for the PDF Malware Detection [8.282177703075451]
We propose a small feature set that doesn't require too much domain knowledge of the PDF file.
We report the best accuracy of 99.75% when using a Random Forest model.
Despite its modest size, we obtain comparable results to state-of-the-art that employ a much larger set of features.
arXiv Detail & Related papers (2023-08-09T04:51:28Z)
- Attention Consistency Refined Masked Frequency Forgery Representation for Generalizing Face Forgery Detection [96.539862328788]
Existing forgery detection methods suffer from unsatisfactory generalization ability to determine the authenticity in the unseen domain.
We propose a novel Attention Consistency Refined masked frequency forgery representation model (ACMF) for generalizing face forgery detection.
Experiment results on several public face forgery datasets demonstrate the superior performance of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2023-07-21T08:58:49Z)
- Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking [54.40184736491652]
We propose a backdoor-based watermarking approach that serves as a general framework for safeguarding publicly available data.
By inserting a small number of watermarking samples into the dataset, our approach enables the learning model to implicitly learn a secret function set by defenders.
This hidden function can then be used as a watermark to track down third-party models that use the dataset illegally.
arXiv Detail & Related papers (2023-03-20T21:54:30Z)
- Perfectly Secure Steganography Using Minimum Entropy Coupling [60.154855689780796]
We show that a steganography procedure is perfectly secure under Cachin 1998's information-theoretic model of steganography.
We also show that, among perfectly secure procedures, a procedure maximizes information throughput if and only if it is induced by a minimum entropy coupling.
arXiv Detail & Related papers (2022-10-24T17:40:07Z)
- Watermarking Images in Self-Supervised Latent Spaces [75.99287942537138]
We revisit watermarking techniques based on pre-trained deep networks, in the light of self-supervised approaches.
We present a way to embed both marks and binary messages into their latent spaces, leveraging data augmentation at marking time.
arXiv Detail & Related papers (2021-12-17T15:52:46Z)
- HAPSSA: Holistic Approach to PDF Malware Detection Using Signal and Statistical Analysis [16.224649756613655]
Malicious PDF documents present a serious threat to various security organizations.
State-of-the-art approaches use machine learning (ML) to learn features that characterize PDF malware.
In this paper, we derive a simple yet effective holistic approach to PDF malware detection.
arXiv Detail & Related papers (2021-11-08T18:32:47Z)
- PDF-Malware: An Overview on Threats, Detection and Evasion Attacks [0.966840768820136]
The widespread use of PDF has instilled a false impression of inherent safety among benign users.
In this work, we give an overview on the PDF-malware detection problem.
arXiv Detail & Related papers (2021-07-27T15:15:20Z)
- Detecting malicious PDF using CNN [46.86114958340962]
Malicious PDF files represent one of the biggest threats to computer security.
We propose a novel algorithm that uses an ensemble of Convolutional Neural Networks (CNNs) on the byte level of the file.
We show, using a data set of 90000 files downloadable online, that our approach maintains a high detection rate (94%) of PDF malware.
arXiv Detail & Related papers (2020-07-24T18:27:45Z)