Plaintext Structure Vulnerability: Robust Cipher Identification via a Distributional Randomness Fingerprint Feature Extractor
- URL: http://arxiv.org/abs/2511.08296v1
- Date: Wed, 12 Nov 2025 01:51:35 GMT
- Title: Plaintext Structure Vulnerability: Robust Cipher Identification via a Distributional Randomness Fingerprint Feature Extractor
- Authors: Xiwen Ren, Min Luo, Cong Peng, Debiao He
- Abstract summary: We present a method that does not learn end-to-end from ciphertext bytes. Specifically, this method is based on a set of statistical tests to compute the randomness feature of the ciphertext. The experimental results demonstrate that our method achieves high discriminative performance.
- Score: 23.713094083283334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern encryption algorithms form the foundation of digital security. However, the widespread use of encryption algorithms results in significant challenges for network defenders in identifying which specific algorithms are being employed. More importantly, we find that when the plaintext distribution of test data departs from the training data, the performance of classifiers often declines significantly. This issue exposes the feature extractor's hidden dependency on plaintext features. To reduce this dependency, we adopt a method that does not learn end-to-end from ciphertext bytes. Specifically, this method is based on a set of statistical tests to compute the randomness feature of the ciphertext, and then uses the frequency distribution pattern of this feature to construct the algorithms' respective fingerprints. The experimental results demonstrate that our method achieves high discriminative performance (e.g., AUC > 0.98) on the Canterbury Corpus dataset, which contains a diverse set of data types. Furthermore, in our cross-domain evaluation, baseline models' performance degrades significantly when tested on data with a reduced proportion of structured plaintext. In sharp contrast, our method demonstrates high robustness: performance degradation is minimal when transferring between different structured domains, and even on the most challenging purely random dataset, it maintains a high level of ranking ability (AUC > 0.90).
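The pipeline described in the abstract can be sketched roughly as follows: run a randomness test on fixed-size ciphertext blocks, histogram the per-block p-values into a frequency distribution, and compare distributions as fingerprints. The monobit frequency test stands in for the paper's full battery of statistical tests; all function names, block sizes, and bin counts here are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a distributional randomness fingerprint, assuming a
# monobit-style test; names and parameters are illustrative, not the paper's.
import math
import os

def monobit_pvalue(block: bytes) -> float:
    """Monobit frequency test: p-value for the ones/zeros balance of a block."""
    ones = sum(bin(b).count("1") for b in block)
    n = 8 * len(block)
    s = abs(2 * ones - n) / math.sqrt(n)
    return math.erfc(s / math.sqrt(2))

def fingerprint(ciphertext: bytes, block_size: int = 64, bins: int = 10):
    """Histogram of per-block p-values, normalized to a frequency distribution."""
    hist = [0] * bins
    for i in range(0, len(ciphertext) - block_size + 1, block_size):
        p = monobit_pvalue(ciphertext[i:i + block_size])
        hist[min(int(p * bins), bins - 1)] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]

def l1_distance(f1, f2) -> float:
    """Compare two fingerprints; smaller means more similar distributions."""
    return sum(abs(a - b) for a, b in zip(f1, f2))

if __name__ == "__main__":
    fp = fingerprint(os.urandom(4096))
    print(fp)
```

In this sketch, classification of an unknown ciphertext would amount to computing its fingerprint and picking the algorithm whose reference fingerprint is nearest under a distance such as `l1_distance`.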
Related papers
- LRANet++: Low-Rank Approximation Network for Accurate and Efficient Text Spotting [118.93173826110815]
We propose a novel parameterized text shape method based on low-rank approximation for precise detection. By exploiting the inherent shape correlation among different text contours, our method achieves consistency and compactness in shape representation. We integrate the enhanced detection module with a lightweight recognition branch to form an end-to-end text spotting framework, termed LRANet++.
arXiv Detail & Related papers (2025-11-08T03:08:03Z)
- Human Texts Are Outliers: Detecting LLM-generated Texts via Out-of-distribution Detection [71.59834293521074]
We develop a framework to distinguish between human-authored and machine-generated text. Our method achieves 98.3% AUROC and AUPR with only 8.9% FPR95 on the DeepFake dataset. Code, pretrained weights, and a demo will be released.
arXiv Detail & Related papers (2025-10-07T08:14:45Z)
- Benchmarking Fraud Detectors on Private Graph Data [70.4654745317714]
Currently, many types of fraud are managed in part by automated detection algorithms that operate over graphs. We consider the scenario where a data holder wishes to outsource development of fraud detectors to third parties. Third parties submit their fraud detectors to the data holder, who evaluates these algorithms on a private dataset and then publicly communicates the results. We propose a realistic privacy attack on this system that allows an adversary to de-anonymize individuals' data based only on the evaluation results.
arXiv Detail & Related papers (2025-07-30T03:20:15Z)
- Using Modular Arithmetic Optimized Neural Networks To Crack Affine Cryptographic Schemes Efficiently [0.27309692684728615]
We investigate the cryptanalysis of affine ciphers using a hybrid neural network architecture. Our approach integrates a modular branch that processes raw ciphertext sequences and a statistical branch that leverages letter frequency features.
arXiv Detail & Related papers (2025-07-17T04:54:10Z)
- Feature Homomorphism -- A Cryptographic Scheme For Data Verification Under Ciphertext-Only Conditions [0.0]
This paper proposes a new type of homomorphism, Feature Homomorphism, and, based on this property, introduces a cryptographic scheme for data verification under ciphertext-only conditions. The proposed scheme involves designing a group of algorithms that meet the requirements outlined in the paper.
arXiv Detail & Related papers (2024-10-22T15:30:24Z)
- Contextual Model Aggregation for Fast and Robust Federated Learning in Edge Computing [88.76112371510999]
Federated learning is a prime candidate for distributed machine learning at the network edge.
Existing algorithms face issues with slow convergence and/or robustness of performance.
We propose a contextual aggregation scheme that achieves the optimal context-dependent bound on loss reduction.
arXiv Detail & Related papers (2022-03-23T21:42:31Z)
- A Fast Randomized Algorithm for Massive Text Normalization [26.602776972067936]
We present FLAN, a scalable randomized algorithm to clean and canonicalize massive text data.
Our algorithm relies on the Jaccard similarity between words to suggest correction results.
Our experimental results on real-world datasets demonstrate the efficiency and efficacy of FLAN.
arXiv Detail & Related papers (2021-10-06T19:18:17Z)
- An Analysis of Robustness of Non-Lipschitz Networks [35.64511156980701]
Small input perturbations can often produce large movements in the network's final-layer feature space.
In our model, the adversary may move data an arbitrary distance in feature space but only in random low-dimensional subspaces.
We provide theoretical guarantees for setting algorithm parameters to optimize over accuracy-abstention trade-offs using data-driven methods.
arXiv Detail & Related papers (2020-10-13T03:56:39Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
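The randomized-smoothing idea in the last entry can be sketched as a majority vote of a base classifier over noisy copies of the input. This is a toy illustration of the general technique, not that paper's certified construction; the base classifier, noise model, and all names below are assumptions.

```python
# Toy randomized-smoothing sketch: classify by majority vote of a base
# classifier evaluated on Gaussian-perturbed copies of the input.
import random

def base_classifier(x: float) -> int:
    # Hypothetical 1-D base classifier: threshold at zero.
    return 1 if x > 0.0 else 0

def smoothed_classifier(x: float, sigma: float = 0.5,
                        n: int = 1000, seed: int = 0) -> int:
    """Majority label of the base classifier under Gaussian input noise."""
    rng = random.Random(seed)
    votes = sum(base_classifier(x + rng.gauss(0.0, sigma)) for _ in range(n))
    return 1 if votes > n // 2 else 0
```

The vote margin (how far `votes / n` sits from 1/2) is what certified-smoothing analyses convert into a provable robustness radius around `x`.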
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.