Segmenting Numerical Substitution Ciphers
- URL: http://arxiv.org/abs/2205.12527v1
- Date: Wed, 25 May 2022 06:45:59 GMT
- Title: Segmenting Numerical Substitution Ciphers
- Authors: Nada Aldarrab, Jonathan May
- Abstract summary: Deciphering historical substitution ciphers is a challenging problem.
We propose the first automatic methods to segment those ciphers using Byte Pair Encoding (BPE) and unigram language models.
We also propose a method for solving non-deterministic ciphers with existing keys using a lattice and a pretrained language model.
- Score: 27.05304607253758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deciphering historical substitution ciphers is a challenging problem. Example
problems that have been previously studied include detecting cipher type,
detecting plaintext language, and acquiring the substitution key for segmented
ciphers. However, attacking unsegmented, space-free ciphers is still a
challenging task. Segmentation (i.e. finding substitution units) is the first
step towards cracking those ciphers. In this work, we propose the first
automatic methods to segment those ciphers using Byte Pair Encoding (BPE) and
unigram language models. Our methods achieve an average segmentation error of
2% on 100 randomly generated monoalphabetic ciphers and 27% on 3 real
homophonic ciphers. We also propose a method for solving non-deterministic
ciphers with existing keys using a lattice and a pretrained language model. Our
method leads to the full solution of the IA cipher, a real historical cipher
that had not been fully solved until this work.
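The BPE-based segmentation idea in the abstract can be illustrated with a minimal sketch. It assumes the cipher is a space-free string of digits and applies plain greedy BPE: repeatedly merge the most frequent adjacent pair of tokens. The function name `learn_bpe`, the `num_merges` parameter, and the stopping rule are hypothetical choices for illustration; this summary does not describe the authors' actual configuration, and the unigram language model variant is not shown.

```python
from collections import Counter


def learn_bpe(cipher: str, num_merges: int):
    """Greedy BPE over a digit string (illustrative sketch, not the paper's code).

    Starts from single digits and merges the most frequent adjacent
    token pair up to num_merges times. Returns the final token list
    (a candidate segmentation) and the merges that were applied.
    """
    tokens = list(cipher)
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:  # assumption: stop when no pair repeats
            break
        merges.append((a, b))
        merged = []
        i = 0
        while i < len(tokens):
            # Merge left to right, without overlapping matches.
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens, merges


tokens, merges = learn_bpe("121212", 1)
print(tokens)  # ['12', '12', '12']
```

Concatenating the returned tokens always reproduces the input, so the output is a segmentation of the cipher rather than a transformation of it. How many merges to run is the main design choice, and it would need to be tuned against known segmentations.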
Related papers
- Three-Input Ciphertext Multiplication for Homomorphic Encryption [6.390468088226496]
Homomorphic encryption (HE) allows computations directly on ciphertexts.
HE is essential to privacy-preserving computing, such as neural network inference, medical diagnosis, and financial data analysis.
This paper proposes 3-input ciphertext multiplication to reduce complexity of computations.
arXiv Detail & Related papers (2024-10-17T13:40:49Z)
- Provably Secure Disambiguating Neural Linguistic Steganography [66.30965740387047]
The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures.
We propose a novel secure disambiguation method named SyncPool, which effectively addresses the segmentation ambiguity problem.
SyncPool does not change the size of the candidate pool or the distribution of tokens and thus is applicable to provably secure language steganography methods.
arXiv Detail & Related papers (2024-03-26T09:25:57Z)
- GEC-DePenD: Non-Autoregressive Grammatical Error Correction with Decoupled Permutation and Decoding [52.14832976759585]
Grammatical error correction (GEC) is an important NLP task that is usually solved with autoregressive sequence-to-sequence models.
We propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation network.
We show that the resulting network improves over previously known non-autoregressive methods for GEC.
arXiv Detail & Related papers (2023-11-14T14:24:36Z)
- GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher [85.18213923151717]
Experimental results show that certain ciphers succeed almost 100% of the time in bypassing the safety alignment of GPT-4 in several safety domains.
We propose a novel SelfCipher that uses only role play and several demonstrations in natural language to evoke this capability.
arXiv Detail & Related papers (2023-08-12T04:05:57Z)
- Classifying World War II Era Ciphers with Machine Learning [1.6317061277457]
We classify Enigma, M-209, Sigaba, Purple, and Typex ciphers from the World War II era.
We find that classic machine learning models perform at least as well as deep learning models.
Ciphers that are more similar in design are somewhat more challenging to distinguish, but not as difficult as might be expected.
arXiv Detail & Related papers (2023-07-02T07:20:47Z)
- CipherSniffer: Classifying Cipher Types [0.0]
We frame the decryption task as a classification problem.
We first create a dataset of transpositions, substitutions, text reversals, word reversals, sentence shifts, and unencrypted text.
arXiv Detail & Related papers (2023-06-13T20:18:24Z)
- Revocable Cryptography from Learning with Errors [61.470151825577034]
We build on the no-cloning principle of quantum mechanics and design cryptographic schemes with key-revocation capabilities.
We consider schemes where secret keys are represented as quantum states with the guarantee that, once the secret key is successfully revoked from a user, they no longer have the ability to perform the same functionality as before.
arXiv Detail & Related papers (2023-02-28T18:58:11Z)
- A Non-monotonic Self-terminating Language Model [62.93465126911921]
In this paper, we focus on the problem of non-terminating sequences resulting from an incomplete decoding algorithm.
We first define an incomplete probable decoding algorithm which includes greedy search, top-$k$ sampling, and nucleus sampling.
We then propose a non-monotonic self-terminating language model, which relaxes the constraint of monotonically increasing termination probability.
arXiv Detail & Related papers (2022-10-03T00:28:44Z)
- Recovering AES Keys with a Deep Cold Boot Attack [91.22679787578438]
Cold boot attacks inspect the corrupted random access memory soon after the power has been shut down.
In this work, we combine a novel cryptographic variant of a deep error correcting code technique with a modified SAT solver scheme to apply the attack on AES keys.
Our results show that our methods outperform the state of the art attack methods by a very large margin.
arXiv Detail & Related papers (2021-06-09T07:57:01Z)
- Can Sequence-to-Sequence Models Crack Substitution Ciphers? [15.898270650875158]
State-of-the-art decipherment methods use beam search and a neural language model to score candidate hypotheses for a given cipher.
We show that our proposed method can decipher text without explicit language identification and can still be robust to noise.
arXiv Detail & Related papers (2020-12-30T17:16:33Z)
- A Few-shot Learning Approach for Historical Ciphered Manuscript Recognition [3.0682439731292592]
We propose a novel method for handwritten cipher recognition based on few-shot object detection.
By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets.
arXiv Detail & Related papers (2020-09-26T11:49:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.