Related papers: Hiding Sensitive Information Using PDF Steganography

URL: http://arxiv.org/abs/2405.00865v1
Date: Wed, 1 May 2024 20:54:12 GMT
Title: Hiding Sensitive Information Using PDF Steganography
Authors: Ryan Klemm, Bo Chen
Abstract summary: We present a novel PDF steganography algorithm based upon least-significant bit insertion into the real-valued operands of PDF stream operators. We also provide a case study which embeds malware into a given cover PDF document.
Score: 3.6533698604619587
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The use of steganography to transmit secret data is becoming increasingly common in security products and malware today. Despite being extremely popular, PDF files are not often the focus of steganography research, as most applications utilize digital image, audio, and video files as their cover data. However, the PDF file format is promising for usage in medium-capacity steganography applications. In this paper, we present a novel PDF steganography algorithm based upon least-significant bit insertion into the real-valued operands of PDF stream operators. Where prior research has only considered a small subset of these operators, we take an extensive look at all the possible operators defined in the Adobe PDF standard to evaluate their usability in our steganography algorithm. We also provide a case study which embeds malware into a given cover PDF document.
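The least-significant bit idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes one bit per real operand, stored in the parity of the last fractional digit, and it treats the content stream as plain text rather than parsing a real PDF. The names `REAL`, `embed`, and `extract` are hypothetical and exist only for this illustration.

```python
import re

# Assumed token shape for a real-valued operand, e.g. 72.125 or -0.50.
# Plain integers (such as the font size 12) are deliberately skipped.
REAL = re.compile(r"(?<![\w./-])-?\d+\.\d+(?![\w.])")


def embed(stream: str, bits: list[int]) -> str:
    """Hide one bit per real operand in the parity of its last fractional digit."""
    if len(bits) > len(REAL.findall(stream)):
        raise ValueError("message longer than cover capacity")
    remaining = iter(bits)

    def rewrite(match: re.Match) -> str:
        token = match.group(0)
        bit = next(remaining, None)
        if bit is None:  # message exhausted: leave the operand untouched
            return token
        last = int(token[-1])
        if last % 2 != bit:
            last ^= 1  # flip the lowest bit; the digit stays within 0..9
        return token[:-1] + str(last)

    return REAL.sub(rewrite, stream)


def extract(stream: str, n_bits: int) -> list[int]:
    """Read the hidden bits back from the first n_bits real operands."""
    return [int(m.group(0)[-1]) % 2 for m in REAL.finditer(stream)][:n_bits]


# Toy content stream: a text position (Td) and an RGB fill colour (rg).
cover = "BT /F1 12 Tf 72.125 712.500 Td 0.25 0.50 0.75 rg (Hello) Tj ET"
secret = [1, 0, 1, 1, 0]
stego = embed(cover, secret)
assert extract(stego, len(secret)) == secret
```

Each operand changes by at most one unit in its last decimal place, which is why this kind of edit is hard to see in the rendered page. The sketch ignores details a real tool would need to handle, such as numbers inside string literals, compressed streams, and which operators tolerate perturbation.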

Related papers

Analyzing PDFs like Binaries: Adversarially Robust PDF Malware Analysis via Intermediate Representation and Language Model [27.85605747467984]
Malicious PDF files have emerged as a persistent threat and become a popular attack vector in web-based attacks. PDF malwares are often susceptible to adversarial attacks, undermining their reliability. We propose a novel approach for PDF feature extraction and PDF malware detection.
arXiv Detail & Related papers (2025-06-20T17:08:08Z)
In-Context Watermarks for Large Language Models [71.29952527565749]
In-Context Watermarking (ICW) embeds watermarks into generated text solely through prompt engineering. We investigate four ICW strategies at different levels of granularity, each paired with a tailored detection method. Our experiments validate the feasibility of ICW as a model-agnostic, practical watermarking approach.
arXiv Detail & Related papers (2025-05-22T17:24:51Z)
PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling [63.93112754821312]
Document understanding is a challenging task to process and comprehend large amounts of textual and visual information. Recent advances in Large Language Models (LLMs) have significantly improved the performance of this task. We introduce PDF-WuKong, a multimodal large language model (MLLM) which is designed to enhance multimodal question-answering (QA) for long PDF documents.
arXiv Detail & Related papers (2024-10-08T12:17:42Z)
Harnessing Lightweight Ciphers for PDF Encryption [1.104960878651584]
Portable Document Format (PDF) is used worldwide as the de facto standard for exchanging documents. At present, PDF encryption only supports the Advanced Encryption Standard (AES) to encrypt and decrypt information. Lightweight cryptography, which refers to cryptography for resource-constrained environments, has gained a lot of popularity.
arXiv Detail & Related papers (2024-09-14T12:59:04Z)
Optimizing Nepali PDF Extraction: A Comparative Study of Parser and OCR Technologies [0.0]
This research compares PDF parsing and Optical Character Recognition (OCR) methods for extracting Nepali content from PDFs. OCR, specifically PyTesseract, overcomes challenges with non-Unicode Nepali fonts. Considering the project's emphasis on Nepali PDFs, PyTesseract emerges as the most suitable library.
arXiv Detail & Related papers (2024-07-05T15:12:14Z)
An Extensive Survey of Digital Image Steganography: State of the Art [0.0]
The need to protect the privacy of sensitive information during information exchange over the internet/intranet has led to wide adoption of cryptography and steganography. This paper critically analyzes the current steganographic techniques, recent trends, and challenges.
arXiv Detail & Related papers (2024-04-30T13:16:24Z)
A Feature Set of Small Size for the PDF Malware Detection [8.282177703075451]
We propose a small feature set that doesn't require much domain knowledge of the PDF file. We report a best accuracy of 99.75% when using a Random Forest model. Despite its modest size, we obtain results comparable to state-of-the-art approaches that employ a much larger set of features.
arXiv Detail & Related papers (2023-08-09T04:51:28Z)
Attention Consistency Refined Masked Frequency Forgery Representation for Generalizing Face Forgery Detection [96.539862328788]
Existing forgery detection methods suffer from unsatisfactory generalization ability when determining authenticity in unseen domains. We propose a novel Attention Consistency Refined masked frequency forgery representation model toward generalizing face forgery detection algorithm (ACMF). Experimental results on several public face forgery datasets demonstrate the superior performance of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2023-07-21T08:58:49Z)
Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking [54.40184736491652]
We propose a backdoor-based watermarking approach that serves as a general framework for safeguarding public-available data. By inserting a small number of watermarking samples into the dataset, our approach enables the learning model to implicitly learn a secret function set by defenders. This hidden function can then be used as a watermark to track down third-party models that use the dataset illegally.
arXiv Detail & Related papers (2023-03-20T21:54:30Z)
Perfectly Secure Steganography Using Minimum Entropy Coupling [60.154855689780796]
We show that a steganography procedure is perfectly secure under Cachin 1998's information-theoretic model of steganography. We also show that, among perfectly secure procedures, a procedure maximizes information throughput if and only if it is induced by a minimum entropy coupling.
arXiv Detail & Related papers (2022-10-24T17:40:07Z)
Watermarking Images in Self-Supervised Latent Spaces [75.99287942537138]
We revisit watermarking techniques based on pre-trained deep networks, in the light of self-supervised approaches. We present a way to embed both marks and binary messages into their latent spaces, leveraging data augmentation at marking time.
arXiv Detail & Related papers (2021-12-17T15:52:46Z)
PDF-Malware: An Overview on Threats, Detection and Evasion Attacks [0.966840768820136]
The widespread use of PDF has instilled a false impression of inherent safety among benign users. In this work, we give an overview of the PDF-malware detection problem.
arXiv Detail & Related papers (2021-07-27T15:15:20Z)
Detecting malicious PDF using CNN [46.86114958340962]
Malicious PDF files represent one of the biggest threats to computer security. We propose a novel algorithm that uses an ensemble of Convolutional Neural Networks (CNNs) on the byte level of the file. We show, using a data set of 90000 files downloadable online, that our approach maintains a high detection rate (94%) of PDF malware.
arXiv Detail & Related papers (2020-07-24T18:27:45Z)