ALICE: An Interpretable Neural Architecture for Generalization in Substitution Ciphers
- URL: http://arxiv.org/abs/2509.07282v2
- Date: Thu, 25 Sep 2025 01:15:04 GMT
- Title: ALICE: An Interpretable Neural Architecture for Generalization in Substitution Ciphers
- Authors: Jeff Shen, Lindsay M. Smith
- Abstract summary: We present cryptogram solving as an ideal testbed for studying neural network reasoning and generalization. We develop ALICE, a simple encoder-only Transformer that sets a new state-of-the-art for both accuracy and speed on this decryption problem. Surprisingly, ALICE generalizes to unseen ciphers after training on only ${\sim}1500$ unique ciphers.
- Score: 0.3403377445166164
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present cryptogram solving as an ideal testbed for studying neural network reasoning and generalization; models must decrypt text encoded with substitution ciphers, choosing from 26! possible mappings without explicit access to the cipher. We develop ALICE (an Architecture for Learning Interpretable Cryptogram dEcipherment), a simple encoder-only Transformer that sets a new state-of-the-art for both accuracy and speed on this decryption problem. Surprisingly, ALICE generalizes to unseen ciphers after training on only ${\sim}1500$ unique ciphers, a minute fraction ($3.7 \times 10^{-24}$) of the possible cipher space. To enhance interpretability, we introduce a novel bijective decoding head that explicitly models permutations via the Gumbel-Sinkhorn method, enabling direct extraction of learned cipher mappings. Through early exit and probing experiments, we reveal how ALICE progressively refines its predictions in a way that appears to mirror common human strategies -- early layers place greater emphasis on letter frequencies, while later layers form word-level structures. Our architectural innovations and analysis methods are applicable beyond cryptograms and offer new insights into neural network generalization and interpretability.
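The bijective decoding head described in the abstract models the cipher as a permutation via the Gumbel-Sinkhorn method. A minimal NumPy sketch of that relaxation follows; the function names, temperature, and iteration counts are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def sinkhorn(log_alpha, n_iters=30):
    """Sinkhorn normalization: alternately normalize rows and columns
    in log space, driving exp(log_alpha) toward a doubly stochastic matrix."""
    for _ in range(n_iters):
        log_alpha = log_alpha - np.log(np.sum(np.exp(log_alpha), axis=1, keepdims=True))
        log_alpha = log_alpha - np.log(np.sum(np.exp(log_alpha), axis=0, keepdims=True))
    return np.exp(log_alpha)

def gumbel_sinkhorn(logits, tau=0.5, n_iters=30, rng=None):
    """Optionally add Gumbel noise, then apply Sinkhorn at temperature tau.
    As tau -> 0 the output approaches a hard permutation matrix."""
    if rng is not None:
        gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
        logits = logits + gumbel
    return sinkhorn(logits / tau, n_iters)

# Toy demo: logits that strongly prefer a planted 6-letter permutation.
rng = np.random.default_rng(0)
perm = rng.permutation(6)
logits = np.full((6, 6), -4.0)
logits[np.arange(6), perm] = 4.0

# Deterministic run (no Gumbel noise): the row-wise argmax of the
# resulting doubly stochastic matrix recovers the planted mapping,
# which is how a learned cipher could be read off such a head.
P = gumbel_sinkhorn(logits, tau=0.5, n_iters=50)
```

Because every row and column of `P` sums to one, the relaxation stays differentiable during training while still admitting a hard permutation readout at inference.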
Related papers
- A New Approach in Cryptanalysis Through Combinatorial Equivalence of Cryptosystems [0.0]
We propose a new approach in cryptanalysis based on an evolution of the concept of Combinatorial Equivalence. The aim is to rewrite a cryptosystem in a combinatorially equivalent form in order to reveal new properties that more strongly discriminate the secret key used during encryption.
arXiv Detail & Related papers (2026-02-16T08:07:41Z) - Can Transformers Break Encryption Schemes via In-Context Learning? [0.0]
In-context learning (ICL) has emerged as a powerful capability of transformer-based language models. We propose a novel application of ICL to the domain of cryptographic function learning.
arXiv Detail & Related papers (2025-08-13T23:09:32Z) - Using Modular Arithmetic Optimized Neural Networks To Crack Affine Cryptographic Schemes Efficiently [0.27309692684728615]
We investigate the cryptanalysis of affine ciphers using a hybrid neural network architecture. Our approach integrates a modular branch that processes raw ciphertext sequences and a statistical branch that leverages letter frequency features.
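The statistical branch above relies on letter-frequency features, a staple of classical cryptanalysis. A minimal sketch of such a feature extractor (the function name and 26-dimensional layout are illustrative assumptions, not the paper's implementation):

```python
from collections import Counter

def letter_frequency_features(ciphertext):
    """Return a 26-dim vector of normalized a-z frequencies, ignoring
    case and non-letter characters; a common cryptanalysis feature."""
    letters = [c for c in ciphertext.lower() if c.isalpha()]
    counts = Counter(letters)
    total = len(letters) or 1  # avoid division by zero on empty input
    return [counts.get(chr(ord("a") + i), 0) / total for i in range(26)]

feats = letter_frequency_features("Abba c!")
```

Because a monoalphabetic or affine cipher permutes letters without changing their counts, this frequency profile is preserved up to reordering, which is what makes it discriminative for key recovery.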
arXiv Detail & Related papers (2025-07-17T04:54:10Z) - Singularity Cipher: A Topology-Driven Cryptographic Scheme Based on Visual Paradox and Klein Bottle Illusions [0.0]
The Singularity Cipher integrates topological transformations and visual paradoxes to achieve multidimensional security. The resulting binary data is encoded using perceptual illusions, such as the missing square paradox, to visually obscure the presence of encrypted content. The paper formalizes the architecture, provides encryption and decryption algorithms, evaluates security properties, and compares the method against classical, post-quantum, and steganographic approaches.
arXiv Detail & Related papers (2025-07-02T04:44:52Z) - Cryptanalysis via Machine Learning Based Information Theoretic Metrics [58.96805474751668]
We propose two novel applications of machine learning (ML) algorithms to perform cryptanalysis on any cryptosystem. These algorithms can be readily applied in an audit setting to evaluate the robustness of a cryptosystem. We show that our classification model correctly identifies the encryption schemes that are not IND-CPA secure, such as DES, RSA, and AES ECB, with high accuracy.
arXiv Detail & Related papers (2025-01-25T04:53:36Z) - Transformers -- Messages in Disguise [3.74142789780782]
NN based cryptography is being investigated due to its ability to learn and implement random cryptographic schemes.
The proposed architecture comprises three new NN layers: the (i) projection layer, (ii) inverse projection layer, and (iii) dot-product layer.
This results in an ANC network that (i) is computationally efficient, (ii) ensures the encrypted message is unique, and (iii) does not induce any communication overhead.
arXiv Detail & Related papers (2024-11-15T15:42:29Z) - The Evolution of Cryptography through Number Theory [55.2480439325792]
While modern cryptography began around 100 years ago, its roots trace back to ancient civilizations like Mesopotamia and Egypt. This paper explores the link between early information hiding techniques and modern cryptographic algorithms like RSA.
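The RSA algorithm mentioned above reduces to modular exponentiation with a keypair derived from two primes. A textbook-sized round trip, using the classic small-prime example (deliberately insecure; illustration only):

```python
# Toy RSA with textbook-sized primes (insecure; illustration only).
p, q = 61, 53
n = p * q                # public modulus: 3233
phi = (p - 1) * (q - 1)  # Euler totient: 3120
e = 17                   # public exponent, coprime with phi
d = pow(e, -1, phi)      # private exponent: modular inverse of e mod phi

message = 65
ciphertext = pow(message, e, n)   # encrypt: m^e mod n
recovered = pow(ciphertext, d, n) # decrypt: c^d mod n
```

Real deployments use primes of hundreds of digits each, since the security rests entirely on the difficulty of factoring `n`.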
arXiv Detail & Related papers (2024-11-11T16:27:57Z) - Cloning Games, Black Holes and Cryptography [50.022147589030304]
We introduce a new toolkit for analyzing cloning games. This framework allows us to analyze a new cloning game based on binary phase states. We show that the optimal bound for the binary phase variant offers quantitative insights into information scrambling in idealized models of black holes.
arXiv Detail & Related papers (2024-11-07T14:09:32Z) - GEC-DePenD: Non-Autoregressive Grammatical Error Correction with
Decoupled Permutation and Decoding [52.14832976759585]
Grammatical error correction (GEC) is an important NLP task that is usually solved with autoregressive sequence-to-sequence models.
We propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation network.
We show that the resulting network improves over previously known non-autoregressive methods for GEC.
arXiv Detail & Related papers (2023-11-14T14:24:36Z) - Classifying World War II Era Ciphers with Machine Learning [1.6317061277457]
We classify Enigma, M-209, Sigaba, Purple, and Typex ciphers from the World War II era.
We find that classic machine learning models perform at least as well as deep learning models.
Ciphers that are more similar in design are somewhat more challenging to distinguish, but not as difficult as might be expected.
arXiv Detail & Related papers (2023-07-02T07:20:47Z) - Revocable Cryptography from Learning with Errors [61.470151825577034]
We build on the no-cloning principle of quantum mechanics and design cryptographic schemes with key-revocation capabilities.
We consider schemes where secret keys are represented as quantum states with the guarantee that, once the secret key is successfully revoked from a user, they no longer have the ability to perform the same functionality as before.
arXiv Detail & Related papers (2023-02-28T18:58:11Z) - Real-World Compositional Generalization with Disentangled
Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z) - Segmenting Numerical Substitution Ciphers [27.05304607253758]
Deciphering historical substitution ciphers is a challenging problem.
We propose the first automatic methods to segment those ciphers using Byte Pair Encoding.
We also propose a method for solving non-deterministic ciphers with existing keys using a lattice and a pretrained language model.
arXiv Detail & Related papers (2022-05-25T06:45:59Z) - Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot
Sentiment Classification [78.120927891455]
State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks.
In this paper, we extend the problem to open vocabulary Electroencephalography (EEG)-to-Text Sequence-to-Sequence decoding and zero-shot sentence sentiment classification on natural reading tasks.
Our model achieves a 40.1% BLEU-1 score on EEG-To-Text decoding and a 55.6% F1 score on zero-shot EEG-based ternary sentiment classification, which significantly outperforms supervised baselines.
arXiv Detail & Related papers (2021-12-05T21:57:22Z) - Inducing Transformer's Compositional Generalization Ability via
Auxiliary Sequence Prediction Tasks [86.10875837475783]
Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions.
Existing neural models have been shown to lack this basic ability in learning symbolic structures.
We propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics.
arXiv Detail & Related papers (2021-09-30T16:41:19Z) - Can Sequence-to-Sequence Models Crack Substitution Ciphers? [15.898270650875158]
State-of-the-art decipherment methods use beam search and a neural language model to score candidate hypotheses for a given cipher.
We show that our proposed method can decipher text without explicit language identification and can still be robust to noise.
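The decipherment setting shared by this paper and ALICE can be sketched in a few lines: a monoalphabetic substitution cipher is one of 26! letter-to-letter bijections, and decryption is applying the inverse mapping. The helper names below are illustrative:

```python
import math
import random
import string

def make_key(seed=0):
    """Draw one of the 26! possible letter-to-letter bijections."""
    rng = random.Random(seed)
    shuffled = list(string.ascii_lowercase)
    rng.shuffle(shuffled)
    return dict(zip(string.ascii_lowercase, shuffled))

def apply_key(text, key):
    """Substitute letters; pass non-letters (spaces, punctuation) through."""
    return "".join(key.get(c, c) for c in text.lower())

key = make_key()
inverse = {v: k for k, v in key.items()}
ciphertext = apply_key("attack at dawn", key)
plaintext = apply_key(ciphertext, inverse)

# The ~1500 training ciphers cited in the ALICE abstract are a
# vanishing fraction of the 26! possible keys.
fraction = 1500 / math.factorial(26)
```

A solver never sees `key`; it must recover `inverse` (or the plaintext directly) from ciphertext statistics alone, which is what makes the 26!-sized hypothesis space a meaningful generalization test.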
arXiv Detail & Related papers (2020-12-30T17:16:33Z) - Cross-Thought for Sentence Encoder Pre-training [89.32270059777025]
Cross-Thought is a novel approach to pre-training a sequence encoder.
We train a Transformer-based sequence encoder over a large set of short sequences.
Experiments on question answering and textual entailment tasks demonstrate that our pre-trained encoder can outperform state-of-the-art encoders.
arXiv Detail & Related papers (2020-10-07T21:02:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.