Related papers: Segmenting Numerical Substitution Ciphers

Segmenting Numerical Substitution Ciphers

URL: http://arxiv.org/abs/2205.12527v1
Date: Wed, 25 May 2022 06:45:59 GMT
Title: Segmenting Numerical Substitution Ciphers
Authors: Nada Aldarrab, Jonathan May
Abstract summary: Deciphering historical substitution ciphers is a challenging problem. We propose the first automatic methods to segment those ciphers using Byte Pair. We also propose a method for solving non-deterministic ciphers with existing keys using a lattice and a pretrained language model.
Score: 27.05304607253758
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deciphering historical substitution ciphers is a challenging problem. Example problems that have been previously studied include detecting cipher type, detecting plaintext language, and acquiring the substitution key for segmented ciphers. However, attacking unsegmented, space-free ciphers is still a challenging task. Segmentation (i.e. finding substitution units) is the first step towards cracking those ciphers. In this work, we propose the first automatic methods to segment those ciphers using Byte Pair Encoding (BPE) and unigram language models. Our methods achieve an average segmentation error of 2\% on 100 randomly-generated monoalphabetic ciphers and 27\% on 3 real homophonic ciphers. We also propose a method for solving non-deterministic ciphers with existing keys using a lattice and a pretrained language model. Our method leads to the full solution of the IA cipher; a real historical cipher that has not been fully solved until this work.

Related papers

Post-Quantum Cryptography: An Analysis of Code-Based and Lattice-Based Cryptosystems [55.49917140500002]
Quantum computers will be able to break modern cryptographic systems using Shor's Algorithm.<n>We first examine the McEliece cryptosystem, a code-based scheme believed to be secure against quantum attacks.<n>We then explore NTRU, a lattice-based system grounded in the difficulty of solving the Shortest Vector Problem.
arXiv Detail & Related papers (2025-05-06T03:42:38Z)
Cryptanalysis via Machine Learning Based Information Theoretic Metrics [58.96805474751668]
We propose two novel applications of machine learning (ML) algorithms to perform cryptanalysis on any cryptosystem. These algorithms can be readily applied in an audit setting to evaluate the robustness of a cryptosystem. We show that our classification model correctly identifies the encryption schemes that are not IND-CPA secure, such as DES, RSA, and AES ECB, with high accuracy.
arXiv Detail & Related papers (2025-01-25T04:53:36Z)
The Evolution of Cryptography through Number Theory [55.2480439325792]
cryptography began around 100 years ago, its roots trace back to ancient civilizations like Mesopotamia and Egypt. This paper explores the link between early information hiding techniques and modern cryptographic algorithms like RSA.
arXiv Detail & Related papers (2024-11-11T16:27:57Z)
Three-Input Ciphertext Multiplication for Homomorphic Encryption [6.390468088226496]
Homomorphic encryption (HE) allows computations directly on ciphertexts. HE is essential to privacy-preserving computing, such as neural network inference, medical diagnosis, and financial data analysis. This paper proposes 3-input ciphertext multiplication to reduce complexity of computations.
arXiv Detail & Related papers (2024-10-17T13:40:49Z)
A Machine Learning-Based Framework for Assessing Cryptographic Indistinguishability of Lightweight Block Ciphers [1.5953412143328967]
Indistinguishability is a fundamental principle of cryptographic security, crucial for securing data transmitted between Internet of Things (IoT) devices. This research investigates the ability of machine learning (ML) in assessing indistinguishability property in encryption systems. We introduce MIND-Crypt, a novel ML-based framework designed to assess the cryptographic indistinguishability of lightweight block ciphers.
arXiv Detail & Related papers (2024-05-30T04:40:13Z)
Provably Secure Disambiguating Neural Linguistic Steganography [66.30965740387047]
The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures. We propose a novel secure disambiguation method named SyncPool, which effectively addresses the segmentation ambiguity problem. SyncPool does not change the size of the candidate pool or the distribution of tokens and thus is applicable to provably secure language steganography methods.
arXiv Detail & Related papers (2024-03-26T09:25:57Z)
GEC-DePenD: Non-Autoregressive Grammatical Error Correction with Decoupled Permutation and Decoding [52.14832976759585]
Grammatical error correction (GEC) is an important NLP task that is usually solved with autoregressive sequence-to-sequence models. We propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation network. We show that the resulting network improves over previously known non-autoregressive methods for GEC.
arXiv Detail & Related papers (2023-11-14T14:24:36Z)
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher [85.18213923151717]
Experimental results show certain ciphers succeed almost 100% of the time to bypass the safety alignment of GPT-4 in several safety domains. We propose a novel SelfCipher that uses only role play and several demonstrations in natural language to evoke this capability.
arXiv Detail & Related papers (2023-08-12T04:05:57Z)
Classifying World War II Era Ciphers with Machine Learning [1.6317061277457]
We classify Enigma, M-209, Sigaba, Purple, and Typex ciphers from World War II era. We find that classic machine learning models perform at least as well as deep learning models. ciphers that are more similar in design are somewhat more challenging to distinguish, but not as difficult as might be expected.
arXiv Detail & Related papers (2023-07-02T07:20:47Z)
CipherSniffer: Classifying Cipher Types [0.0]
We frame the decryption task as a classification problem. We first create a dataset of transpositions, substitutions, text reversals, word reversals, sentence shifts, and unencrypted text.
arXiv Detail & Related papers (2023-06-13T20:18:24Z)
Revocable Cryptography from Learning with Errors [61.470151825577034]
We build on the no-cloning principle of quantum mechanics and design cryptographic schemes with key-revocation capabilities. We consider schemes where secret keys are represented as quantum states with the guarantee that, once the secret key is successfully revoked from a user, they no longer have the ability to perform the same functionality as before.
arXiv Detail & Related papers (2023-02-28T18:58:11Z)
A Non-monotonic Self-terminating Language Model [62.93465126911921]
In this paper, we focus on the problem of non-terminating sequences resulting from an incomplete decoding algorithm. We first define an incomplete probable decoding algorithm which includes greedy search, top-$k$ sampling, and nucleus sampling. We then propose a non-monotonic self-terminating language model, which relaxes the constraint of monotonically increasing termination probability.
arXiv Detail & Related papers (2022-10-03T00:28:44Z)
Recovering AES Keys with a Deep Cold Boot Attack [91.22679787578438]
Cold boot attacks inspect the corrupted random access memory soon after the power has been shut down. In this work, we combine a novel cryptographic variant of a deep error correcting code technique with a modified SAT solver scheme to apply the attack on AES keys. Our results show that our methods outperform the state of the art attack methods by a very large margin.
arXiv Detail & Related papers (2021-06-09T07:57:01Z)
Can Sequence-to-Sequence Models Crack Substitution Ciphers? [15.898270650875158]
State-of-the-art decipherment methods use beam search and a neural language model to score candidate hypotheses for a given cipher. We show that our proposed method can decipher text without explicit language identification and can still be robust to noise.
arXiv Detail & Related papers (2020-12-30T17:16:33Z)
A Few-shot Learning Approach for Historical Ciphered Manuscript Recognition [3.0682439731292592]
We propose a novel method for handwritten ciphers recognition based on few-shot object detection. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets.
arXiv Detail & Related papers (2020-09-26T11:49:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.