Classifying World War II Era Ciphers with Machine Learning
- URL: http://arxiv.org/abs/2307.00501v2
- Date: Wed, 30 Aug 2023 13:02:41 GMT
- Title: Classifying World War II Era Ciphers with Machine Learning
- Authors: Brooke Dalton and Mark Stamp
- Abstract summary: We classify the World War II era ciphers Enigma, M-209, Sigaba, Purple, and Typex.
We find that classic machine learning models perform at least as well as deep learning models.
Ciphers that are more similar in design are somewhat more challenging to distinguish, but not as difficult as might be expected.
- Score: 1.6317061277457
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We determine the accuracy with which machine learning and deep learning
techniques can classify selected World War II era ciphers when only ciphertext
is available. The specific ciphers considered are Enigma, M-209, Sigaba,
Purple, and Typex. We experiment with three classic machine learning models,
namely, Support Vector Machines (SVM), $k$-Nearest Neighbors ($k$-NN), and
Random Forest (RF). We also experiment with four deep learning neural
network-based models: Multi-Layer Perceptrons (MLP), Long Short-Term Memory
(LSTM), Extreme Learning Machines (ELM), and Convolutional Neural Networks
(CNN). Each model is trained on features consisting of histograms, digrams, and
raw ciphertext letter sequences. Furthermore, the classification problem is
considered under four distinct scenarios: Fixed plaintext with fixed keys,
random plaintext with fixed keys, fixed plaintext with random keys, and random
plaintext with random keys. Under the most realistic scenario, given 1000
characters per ciphertext, we are able to distinguish the ciphers with greater
than 97% accuracy. In addition, we consider the accuracy of a subset of the
learning techniques as a function of the length of the ciphertext messages.
Somewhat surprisingly, our classic machine learning models perform at least as
well as our deep learning models. We also find that ciphers that are more
similar in design are somewhat more challenging to distinguish, but not as
difficult as might be expected.
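To make the pipeline concrete, here is a minimal Python sketch, assuming scikit-learn and randomly generated stand-in data rather than the paper's actual machine-cipher output, of classifying ciphertext from histogram and digram features with two of the classic models (SVM and Random Forest).

```python
# Minimal sketch of the classification setup described in the abstract:
# letter-histogram and digram-count features extracted from ciphertext only.
# The random "ciphertexts" below are placeholders, not Enigma/M-209/Sigaba/
# Purple/Typex output.
import random
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
IDX = {c: i for i, c in enumerate(ALPHABET)}

def features(ciphertext: str) -> np.ndarray:
    """26-bin letter histogram concatenated with a flattened 26x26 digram count."""
    hist, digrams = np.zeros(26), np.zeros((26, 26))
    for c in ciphertext:
        hist[IDX[c]] += 1
    for a, b in zip(ciphertext, ciphertext[1:]):
        digrams[IDX[a], IDX[b]] += 1
    n = len(ciphertext)
    return np.concatenate([hist / n, digrams.ravel() / (n - 1)])

# stand-in data: random 1000-character strings with a fake 5-way "cipher" label
texts = ["".join(random.choices(ALPHABET, k=1000)) for _ in range(200)]
labels = [i % 5 for i in range(200)]   # 5 classes, mirroring the 5 machines

X = np.stack([features(t) for t in texts])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)
for model in (SVC(), RandomForestClassifier(n_estimators=100)):
    model.fit(X_tr, y_tr)
    print(type(model).__name__, "accuracy:", model.score(X_te, y_te))
```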
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers.
It is common to instead use proxy tasks that are similar in only an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
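A minimal PyTorch sketch of this idea, training an LSTM directly as a binary recognizer of strings on the toy language a^n b^n; the architecture and the sampler are illustrative assumptions, not the paper's setup.

```python
# Illustrative sketch (not the paper's code): an LSTM trained directly as a
# binary classifier of strings for the toy language a^n b^n.
import random
import torch
import torch.nn as nn

class Recognizer(nn.Module):
    def __init__(self, vocab_size: int = 2, dim: int = 32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, 1)

    def forward(self, tokens):                 # tokens: (batch, length)
        h, _ = self.rnn(self.emb(tokens))
        return self.out(h[:, -1]).squeeze(-1)  # logit: in-language or not

def sample():
    """Positive: a^k b^k; negative: a^k b^m with m != k (0 = 'a', 1 = 'b')."""
    k = random.randint(1, 8)
    if random.random() < 0.5:
        return [0] * k + [1] * k, 1.0
    m = random.choice([j for j in range(1, 9) if j != k])
    return [0] * k + [1] * m, 0.0

model = Recognizer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(2000):
    seq, label = sample()
    loss = loss_fn(model(torch.tensor([seq])), torch.tensor([label]))
    opt.zero_grad(); loss.backward(); opt.step()
```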
arXiv Detail & Related papers (2024-11-11T16:33:25Z)
- Breaking Indistinguishability with Transfer Learning: A First Look at SPECK32/64 Lightweight Block Ciphers [1.5953412143328967]
We introduce MIND-Crypt, a novel attack framework that uses deep learning (DL) and transfer learning (TL) to challenge the indistinguishability of block ciphers.
Our methodology includes training a DL model with ciphertexts of two messages encrypted using the same key.
For the TL, we use the trained DL model as a feature extractor, and these features are then used to train a shallow machine learning model, such as XGBoost.
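A hedged sketch of that transfer-learning step: a frozen network's activations become features for an XGBoost classifier that tries to tell which of the two messages a ciphertext encrypts. The tiny MLP and random data below are stand-ins, not the paper's model or ciphertexts.

```python
# Hedged sketch of the transfer-learning step: a frozen feature extractor
# feeds a shallow XGBoost classifier. Stand-in network and data only.
import numpy as np
import torch
import torch.nn as nn
import xgboost as xgb

# stand-in for a DL model pretrained on ciphertexts of the two messages
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))

@torch.no_grad()
def extract_features(x: np.ndarray) -> np.ndarray:
    """Run the frozen feature extractor on (byte-encoded) ciphertexts."""
    return backbone(torch.tensor(x, dtype=torch.float32)).numpy()

rng = np.random.default_rng(0)
X = rng.random((512, 32))            # stand-in ciphertext encodings
y = rng.integers(0, 2, size=512)     # which of the two messages was encrypted

feats = extract_features(X)
clf = xgb.XGBClassifier(n_estimators=200, max_depth=4)
clf.fit(feats[:400], y[:400])
print("distinguishing accuracy:", clf.score(feats[400:], y[400:]))
```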
arXiv Detail & Related papers (2024-05-30T04:40:13Z)
- Modeling Linear and Non-linear Layers: An MILP Approach Towards Finding Differential and Impossible Differential Propagations [1.5327660568487471]
We introduce an automatic tool for exploring differential and impossible propagations within a cipher.
The tool is successfully applied to five lightweight block ciphers: Lilliput, GIFT64, SKINNY64, Klein, and MIBS.
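For illustration only, here is a toy PuLP sketch, not the paper's tool, of the standard MILP encoding of differential propagation through XOR: binary variables mark active bits, and a dummy variable forbids exactly one active operand among (a, b, a XOR b).

```python
# Toy MILP encoding of XOR difference propagation (assumption: a generic
# textbook encoding, not the paper's model), solved with PuLP's bundled CBC.
import pulp

prob = pulp.LpProblem("min_active_sboxes", pulp.LpMinimize)

def bit(name):
    return pulp.LpVariable(name, cat="Binary")

def xor_branch(a, b, c, tag):
    """Difference through c = a ^ b: all inactive, or >= 2 of (a, b, c) active."""
    d = bit(f"dummy_{tag}")
    prob += a + b + c >= 2 * d
    prob += a <= d
    prob += b <= d
    prob += c <= d

a, b, c = bit("a"), bit("b"), bit("c")
xor_branch(a, b, c, "xor0")
sbox = bit("sbox0")
prob += c <= sbox           # an active input difference activates the S-box
prob += a + b >= 1          # require a non-trivial input difference
prob += pulp.lpSum([sbox])  # objective: minimize the number of active S-boxes

prob.solve(pulp.PULP_CBC_CMD(msg=False))
# the solver finds the cancelling pattern a = b = 1, c = 0 (0 active S-boxes)
print("active S-boxes:", int(pulp.value(prob.objective)))
print({v.name: v.value() for v in prob.variables()})
```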
arXiv Detail & Related papers (2024-05-01T10:48:23Z)
- GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher [85.18213923151717]
Experimental results show that certain ciphers succeed in bypassing the safety alignment of GPT-4 almost 100% of the time in several safety domains.
We propose a novel SelfCipher that uses only role play and several demonstrations in natural language to evoke this capability.
arXiv Detail & Related papers (2023-08-12T04:05:57Z)
- Memorization for Good: Encryption with Autoregressive Language Models [8.645826579841692]
We propose the first symmetric encryption algorithm with autoregressive language models (SELM).
We show that autoregressive LMs can encode arbitrary data into a compact real-valued vector (i.e., encryption) and then losslessly decode the vector to the original message (i.e., decryption) via random subspace optimization and greedy decoding.
arXiv Detail & Related papers (2023-05-15T05:42:34Z)
- Revocable Cryptography from Learning with Errors [61.470151825577034]
We build on the no-cloning principle of quantum mechanics and design cryptographic schemes with key-revocation capabilities.
We consider schemes where secret keys are represented as quantum states with the guarantee that, once the secret key is successfully revoked from a user, they no longer have the ability to perform the same functionality as before.
arXiv Detail & Related papers (2023-02-28T18:58:11Z)
- Are Deep Neural Networks SMARTer than Second Graders? [85.60342335636341]
We evaluate the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed for children in the 6-8 age group.
Our dataset consists of 101 unique puzzles; each puzzle comprises a picture and a question, and its solution requires a mix of several elementary skills, including arithmetic, algebra, and spatial reasoning.
Experiments reveal that while powerful deep models offer reasonable performance on puzzles in a supervised setting, they perform no better than random when analyzed for generalization.
arXiv Detail & Related papers (2022-12-20T04:33:32Z)
- Hiding Images in Deep Probabilistic Models [58.23127414572098]
We describe a different computational framework to hide images in deep probabilistic models.
Specifically, we use a DNN to model the probability density of cover images, and hide a secret image in one particular location of the learned distribution.
We demonstrate the feasibility of our SinGAN approach in terms of extraction accuracy and model security.
arXiv Detail & Related papers (2022-10-05T13:33:25Z)
- Segmenting Numerical Substitution Ciphers [27.05304607253758]
Deciphering historical substitution ciphers is a challenging problem.
We propose the first automatic methods to segment such ciphers using Byte Pair Encoding (BPE).
We also propose a method for solving non-deterministic ciphers with existing keys using a lattice and a pretrained language model.
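A minimal sketch of how BPE-style merging can surface recurring multi-symbol units in a numeric cipher stream; the merge loop below is a generic BPE implementation, not the paper's segmenter.

```python
# Generic Byte Pair Encoding on numeric cipher symbols: repeatedly merge the
# most frequent adjacent pair so recurring multi-symbol units emerge.
from collections import Counter

def learn_bpe(sequences, num_merges):
    """sequences: lists of cipher symbols (ints). Returns (merges, segmented)."""
    seqs = [list(map(str, s)) for s in sequences]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for s in seqs:
            pairs.update(zip(s, s[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merges.append((a, b))
        merged = a + "+" + b
        for s in seqs:
            i = 0
            while i < len(s) - 1:
                if s[i] == a and s[i + 1] == b:
                    s[i:i + 2] = [merged]
                else:
                    i += 1
    return merges, seqs

# toy usage: a repeating unit 17-4-9 is discovered after two merges
merges, segmented = learn_bpe([[17, 4, 9, 17, 4, 22, 17, 4, 9]], num_merges=2)
print(merges)       # [('17', '4'), ('17+4', '9')]
print(segmented)    # [['17+4+9', '17+4', '22', '17+4+9']]
```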
arXiv Detail & Related papers (2022-05-25T06:45:59Z)
- Can Sequence-to-Sequence Models Crack Substitution Ciphers? [15.898270650875158]
State-of-the-art decipherment methods use beam search and a neural language model to score candidate hypotheses for a given cipher.
We show that our proposed method can decipher text without explicit language identification and can still be robust to noise.
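A simplified sketch of the beam-search baseline described above for a 1:1 substitution cipher, with a toy unigram model standing in for the neural language model; the scoring scheme and tiny alphabet are illustrative assumptions.

```python
# Toy beam-search decipherment: extend partial key mappings cipher-symbol ->
# plaintext-letter, keeping the top hypotheses under a (toy) LM score.
import math

ENGLISH_FREQ = {"e": .127, "t": .091, "a": .082, "o": .075, "i": .070, "n": .067}

def lm_score(text):
    """Toy stand-in for a neural LM: unigram log-probability."""
    return sum(math.log(ENGLISH_FREQ.get(c, 0.001)) for c in text)

def beam_decipher(ciphertext, beam_width=10):
    # most frequent cipher symbols are mapped first
    symbols = sorted(set(ciphertext), key=ciphertext.count, reverse=True)
    beam = [({}, 0.0)]                      # (partial key, score)
    for sym in symbols:
        candidates = []
        for key, _ in beam:
            for letter in ENGLISH_FREQ:     # toy alphabet; real systems use a-z
                if letter in key.values():
                    continue                # keep the substitution one-to-one
                new_key = {**key, sym: letter}
                partial = "".join(new_key.get(c, "?") for c in ciphertext)
                candidates.append((new_key, lm_score(partial.replace("?", ""))))
        beam = sorted(candidates, key=lambda kv: kv[1], reverse=True)[:beam_width]
    key, score = beam[0]
    return "".join(key.get(c, "?") for c in ciphertext), score

print(beam_decipher("XYZZX"))   # toy: 3 distinct symbols mapped to frequent letters
```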
arXiv Detail & Related papers (2020-12-30T17:16:33Z)
- Cryptanalytic Extraction of Neural Network Models [56.738871473622865]
We introduce a differential attack that can efficiently steal the parameters of the remote model up to floating point precision.
Our attack relies on the fact that ReLU neural networks are piecewise linear functions.
We extract models that are $2^{20}$ times more precise and require 100x fewer queries than prior work.
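A hedged sketch of the core observation: a ReLU network is piecewise linear, so points where a hidden unit flips sign appear as kinks in the output along a line and can be located by binary search on the directional slope. The tiny network below is a stand-in, not the attack from the paper.

```python
# Locate a ReLU "critical point" (kink) along a line segment by binary search
# on the directional slope. Stand-in one-hidden-layer network as the oracle.
import numpy as np

def relu_net(x, W1, b1, w2):
    """Tiny 1-hidden-layer ReLU network used as the query oracle."""
    return w2 @ np.maximum(W1 @ x + b1, 0.0)

def slope(f, x, d, eps=1e-6):
    return (f(x + eps * d) - f(x)) / eps

def find_kink(f, x0, x1, d, tol=1e-9):
    """Binary-search x0 -> x1 for where the directional slope first changes,
    i.e. where some hidden ReLU crosses zero."""
    lo, hi = 0.0, 1.0
    s_lo = slope(f, x0, d)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if np.isclose(slope(f, x0 + mid * (x1 - x0), d), s_lo):
            lo = mid          # still on the starting linear piece
        else:
            hi = mid          # a kink lies at or before mid
    return x0 + lo * (x1 - x0)

rng = np.random.default_rng(0)
W1, b1, w2 = rng.normal(size=(4, 2)), rng.normal(size=4), rng.normal(size=4)
f = lambda x: relu_net(x, W1, b1, w2)
x0, x1 = np.array([-3.0, 0.0]), np.array([3.0, 0.0])
d = (x1 - x0) / np.linalg.norm(x1 - x0)
print("critical point near:", find_kink(f, x0, x1, d))
```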
arXiv Detail & Related papers (2020-03-10T17:57:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.