Extracting Moore Machines from Transformers using Queries and Counterexamples
- URL: http://arxiv.org/abs/2410.06045v2
- Date: Sun, 28 Sep 2025 13:25:45 GMT
- Title: Extracting Moore Machines from Transformers using Queries and Counterexamples
- Authors: Rik Adriaensen, Jaron Maene
- Abstract summary: We construct finite state automata as high-level abstractions of transformers trained on regular languages. We extract Moore machines, as many training tasks used in literature can be mapped onto them. We demonstrate the usefulness of this approach by studying positive-only learning and the sequence accuracy measure in detail.
- Score: 6.612713406498215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fuelled by the popularity of the transformer architecture in deep learning, several works have investigated what formal languages a transformer can learn from data. Nonetheless, existing results remain hard to compare due to methodological differences. To address this, we construct finite state automata as high-level abstractions of transformers trained on regular languages using queries and counterexamples. Concretely, we extract Moore machines, as many training tasks used in literature can be mapped onto them. We demonstrate the usefulness of this approach by studying positive-only learning and the sequence accuracy measure in detail.
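The phrase "queries and counterexamples" refers to Angluin-style active automata learning: the learner poses output queries to the trained transformer and refines its hypothesis machine whenever an equivalence check yields a counterexample. Below is a minimal Python sketch of the oracle interface such a learner needs; `model` and the sample-based equivalence check are hypothetical stand-ins, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the oracle interface an L*-style
# learner uses when extracting a Moore machine from a trained transformer.
# `model` is a hypothetical black box mapping a string over the task alphabet
# to the discrete label the trained transformer produces for it.
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Optional, Tuple

@dataclass
class MooreMachine:
    """Finite-state machine whose output is attached to states, not edges."""
    initial: Hashable
    transitions: Dict[Tuple[Hashable, str], Hashable]  # (state, symbol) -> state
    outputs: Dict[Hashable, str]                        # state -> output label

    def run(self, word: str) -> str:
        state = self.initial
        for symbol in word:
            state = self.transitions[(state, symbol)]
        return self.outputs[state]

def output_query(model: Callable[[str], str], word: str) -> str:
    """Output (membership) query: ask the black-box model for its label on `word`."""
    return model(word)

def equivalence_query(model: Callable[[str], str],
                      hypothesis: MooreMachine,
                      sample: List[str]) -> Optional[str]:
    """Approximate equivalence query: search a finite sample for a word on
    which the hypothesis and the black-box model disagree."""
    for word in sample:
        if hypothesis.run(word) != output_query(model, word):
            return word  # counterexample, used to refine the hypothesis
    return None
```

An L*-style learner alternates output queries, which fill an observation table, with equivalence queries, which supply counterexamples, until no disagreement is found and the hypothesis machine is returned as the abstraction.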
Related papers
- Probability Distributions Computed by Hard-Attention Transformers [53.17368795629463]
We show that making transformer language recognizers autoregressive can sometimes increase their expressivity.
Our overall contribution is to tease apart what functions transformers are capable of expressing, in their most common use-case as language models.
arXiv Detail & Related papers (2025-10-31T02:41:05Z) - Learning Spectral Methods by Transformers [18.869174453242383]
We show that multi-layered Transformers, given a sufficiently large set of pre-training instances, are able to learn the algorithms themselves.
This learning paradigm is distinct from the in-context learning setup and is similar to the learning procedure of human brains.
arXiv Detail & Related papers (2025-01-02T15:53:25Z) - Algorithmic Capabilities of Random Transformers [49.73113518329544]
We investigate what functions can be learned by randomly initialized transformers in which only the embedding layers are optimized.
We find that these random transformers can perform a wide range of meaningful algorithmic tasks.
Our results indicate that some algorithmic capabilities are present in transformers even before these models are trained.
arXiv Detail & Related papers (2024-10-06T06:04:23Z) - Can Transformers Learn $n$-gram Language Models? [77.35809823602307]
We study transformers' ability to learn random $n$-gram LMs of two kinds.
We find that classic estimation techniques for $n$-gram LMs such as add-$\lambda$ smoothing outperform transformers.
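For context, add-$\lambda$ smoothing is the classical estimator the comparison refers to: every $n$-gram count is inflated by a constant $\lambda$ before normalizing. A minimal sketch, assuming a tokenized `corpus` and a fixed `vocab` (both hypothetical inputs, not data from the paper):

```python
# Minimal sketch of add-lambda smoothing for an n-gram LM (a standard baseline,
# not code from the paper). `corpus` is a list of tokenized sentences and
# `vocab` the token inventory; both are hypothetical inputs.
from collections import Counter

def add_lambda_ngram(corpus, vocab, n, lam):
    ngram_counts, context_counts = Counter(), Counter()
    for sent in corpus:
        padded = ["<bos>"] * (n - 1) + sent + ["<eos>"]
        for i in range(n - 1, len(padded)):
            context, word = tuple(padded[i - n + 1:i]), padded[i]
            ngram_counts[(context, word)] += 1
            context_counts[context] += 1

    vocab_size = len(vocab) + 1  # simplification: +1 for the <eos> symbol

    def prob(context, word):
        # P(word | context) = (count + lambda) / (context_count + lambda * |V|)
        return (ngram_counts[(tuple(context), word)] + lam) / \
               (context_counts[tuple(context)] + lam * vocab_size)

    return prob
```

For a bigram model, `add_lambda_ngram(corpus, vocab, n=2, lam=0.1)(("the",), "cat")` would return the smoothed probability of "cat" following "the".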
arXiv Detail & Related papers (2024-10-03T21:21:02Z) - Automata Extraction from Transformers [5.419884861365132]
We propose an automata extraction algorithm specifically designed for Transformer models.
Treating the Transformer model as a black-box system, we track the model through the transformation process of its internal latent representations.
We then use classical pedagogical approaches like L* algorithm to interpret them as deterministic finite-state automata.
arXiv Detail & Related papers (2024-06-08T20:07:24Z) - A Transformer with Stack Attention [84.18399019794036]
We propose augmenting transformer-based language models with a differentiable, stack-based attention mechanism.
Our stack-based attention mechanism can be incorporated into any transformer-based language model and adds a level of interpretability to the model.
We show that the addition of our stack-based attention mechanism enables the transformer to model some, but not all, deterministic context-free languages.
arXiv Detail & Related papers (2024-05-07T17:47:57Z) - Transformers Can Represent $n$-gram Language Models [56.06361029539347]
We focus on the relationship between transformer LMs and $n$-gram LMs, a simple and historically relevant class of language models.
We show that transformer LMs using the hard or sparse attention mechanisms can exactly represent any $n$-gram LM.
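For reference, an $n$-gram LM factorizes the probability of a string so that each symbol is conditioned only on the preceding $n-1$ symbols (with suitable padding at the start):

$$p(w_1, \dots, w_T) = \prod_{t=1}^{T} p\left(w_t \mid w_{t-n+1}, \dots, w_{t-1}\right)$$

The representation result says that hard- or sparse-attention transformer LMs can realize every conditional distribution of this form exactly.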
arXiv Detail & Related papers (2024-04-23T12:51:37Z) - Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages [7.938342455750221]
We study exact characterizations of transformers with hard attention and attention masking.
With strict masking (each position cannot attend to itself) and without position embeddings, these transformers are expressively equivalent to linear temporal logic.
arXiv Detail & Related papers (2023-10-21T03:26:39Z) - The Expressive Power of Transformers with Chain of Thought [29.839710738657203]
In practice, transformers can be improved by allowing them to use a "chain of thought" or "scratchpad" of intermediate tokens before answering.
We show that this does extend their power, but the amount of increase depends crucially on the amount of intermediate generation.
Our results also imply that linear steps keep transformer decoders within context-sensitive languages.
arXiv Detail & Related papers (2023-10-11T22:35:18Z) - Characterizing Intrinsic Compositionality in Transformers with Tree Projections [72.45375959893218]
Neural models like transformers can route information arbitrarily between different parts of their input.
We show that transformers for three different tasks become more treelike over the course of training.
These trees are predictive of model behavior, with more tree-like models generalizing better on tests of compositional generalization.
arXiv Detail & Related papers (2022-11-02T17:10:07Z) - Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages [120.74406230847904]
The first method, TP-Transformer, augments the traditional Transformer architecture to include an additional component to represent structure.
The second method imbues structure at the data level by segmenting the data with morphological tokenization.
We find that each of these two approaches allows the network to achieve better performance, but this improvement is dependent on the size of the dataset.
arXiv Detail & Related papers (2022-08-11T22:42:24Z) - Combining Transformers with Natural Language Explanations [13.167758466408825]
We propose an extension to transformer models that makes use of external memories to store natural language explanations and uses them to explain classification outputs.
We conduct an experimental evaluation on two domains, legal text analysis and argument mining, to show that our approach can produce relevant explanations while retaining or even improving classification performance.
arXiv Detail & Related papers (2021-09-02T09:17:04Z) - On the Power of Saturated Transformers: A View from Circuit Complexity [87.20342701232869]
We show that saturated transformers transcend the limitations of hard-attention transformers.
The jump from hard to saturated attention can be understood as increasing the transformer's effective circuit depth by a factor of $O(\log n)$.
arXiv Detail & Related papers (2021-06-30T17:09:47Z) - Thinking Like Transformers [64.96770952820691]
We propose a computational model for the transformer-encoder in the form of a programming language.
We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer.
We provide RASP programs for histograms, sorting, and Dyck-languages.
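As an illustration of RASP's select-then-aggregate style, here is a small Python rendering (not actual RASP syntax, and not from the paper) of the histogram task, which computes for each position how many positions in the sequence hold the same token:

```python
# Python rendering (not RASP syntax) of the RASP-style histogram computation:
# each position "attends" to all positions holding the same token and counts
# them, mirroring a select (match) followed by an aggregate (count).
def histogram(tokens):
    counts = []
    for i, t_i in enumerate(tokens):
        # select: positions j whose token equals the token at position i
        selected = [j for j, t_j in enumerate(tokens) if t_j == t_i]
        # aggregate: count the selected positions
        counts.append(len(selected))
    return counts

print(histogram(list("hello")))  # [1, 1, 2, 2, 1]
```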
arXiv Detail & Related papers (2021-06-13T13:04:46Z) - Scalable Transformers for Neural Machine Translation [86.4530299266897]
The Transformer has been widely adopted in Neural Machine Translation (NMT) because of its large capacity and its parallelizable training for sequence generation.
We propose novel scalable Transformers, which naturally contain sub-Transformers of different scales with shared parameters.
A three-stage training scheme is proposed to tackle the difficulty of training the scalable Transformers.
arXiv Detail & Related papers (2021-06-04T04:04:10Z) - Transformer visualization via dictionary learning: contextualized embedding as a linear superposition of transformer factors [15.348047288817478]
We propose to use dictionary learning to open up these "black boxes" by representing contextualized embeddings as linear superpositions of transformer factors.
Through visualization, we demonstrate the hierarchical semantic structures captured by the transformer factors.
We hope this visualization tool can bring further knowledge and a better understanding of how transformer networks work.
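A minimal sketch of the dictionary-learning idea, assuming `embeddings` holds contextualized activations collected from some transformer layer (the random array below is only a placeholder, and this is not the paper's pipeline):

```python
# Sketch of sparse dictionary learning over contextualized embeddings.
# Assumption: `embeddings` is an (n_tokens, d_model) array of activations
# gathered from a transformer layer; random data stands in for it here.
import numpy as np
from sklearn.decomposition import DictionaryLearning

embeddings = np.random.randn(1000, 64)  # placeholder for real activations

# Each embedding is approximated as a sparse linear superposition of learned
# dictionary atoms ("transformer factors").
dl = DictionaryLearning(n_components=128, transform_algorithm="lasso_lars",
                        alpha=1.0, max_iter=50)
codes = dl.fit_transform(embeddings)   # sparse coefficients, shape (1000, 128)
factors = dl.components_               # dictionary atoms, shape (128, 64)

# Reconstruction: embeddings ≈ codes @ factors
```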
arXiv Detail & Related papers (2021-03-29T20:51:33Z) - On the Ability and Limitations of Transformers to Recognize Formal Languages [9.12267978757844]
We provide a construction of Transformers for a subclass of counter languages.
We find that Transformers do well on this subclass, and their learned mechanism strongly correlates with our construction.
Perhaps surprisingly, in contrast to LSTMs, Transformers do well only on a subset of regular languages, with degrading performance.
arXiv Detail & Related papers (2020-09-23T17:21:33Z) - Addressing Some Limitations of Transformers with Feedback Memory [51.94640029417114]
Transformers have been successfully applied to sequential, auto-regressive tasks despite being feedforward networks.
We propose the Feedback Transformer architecture that exposes all previous representations to all future representations.
We demonstrate on a variety of benchmarks in language modeling, machine translation, and reinforcement learning that the increased representation capacity can create small, shallow models with much stronger performance than comparable Transformers.
arXiv Detail & Related papers (2020-02-21T16:37:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.