Related papers: On the Linguistic Capacity of Real-Time Counter Automata

On the Linguistic Capacity of Real-Time Counter Automata

URL: http://arxiv.org/abs/2004.06866v2
Date: Thu, 9 Sep 2021 05:24:50 GMT
Title: On the Linguistic Capacity of Real-Time Counter Automata
Authors: William Merrill
Abstract summary: We study the abilities of real-time counter machines as formal grammars. We show that counter languages are closed under complement, union, intersection, and many other common set operations. This work makes general contributions to the theory of formal languages that are of potential interest for understanding recurrent neural networks.
Score: 1.8072051868187933
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Counter machines have achieved a newfound relevance to the field of natural language processing (NLP): recent work suggests some strong-performing recurrent neural networks utilize their memory as counters. Thus, one potential way to understand the success of these networks is to revisit the theory of counter computation. Therefore, we study the abilities of real-time counter machines as formal grammars, focusing on formal properties that are relevant for NLP models. We first show that several variants of the counter machine converge to express the same class of formal languages. We also prove that counter languages are closed under complement, union, intersection, and many other common set operations. Next, we show that counter machines cannot evaluate boolean expressions, even though they can weakly validate their syntax. This has implications for the interpretability and evaluation of neural network systems: successfully matching syntactic patterns does not guarantee that counter memory accurately encodes compositional semantics. Finally, we consider whether counter languages are semilinear. This work makes general contributions to the theory of formal languages that are of potential interest for understanding recurrent neural networks.

Related papers

Implicit Language Models are RNNs: Balancing Parallelization and Expressivity [4.332158627306896]
State-space models (SSMs) and transformers dominate the language modeling landscape. We propose implicit SSMs, which iterate a transformation until convergence to a fixed point. Our approach demonstrates superior state-tracking capabilities on regular languages, surpassing transformers and SSMs.
arXiv Detail & Related papers (2025-02-10T19:59:31Z)
Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers. It is common to instead use proxy tasks that are similar in only an informal sense. We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
arXiv Detail & Related papers (2024-11-11T16:33:25Z)
On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning [87.73401758641089]
Chain-of-thought (CoT) reasoning has improved the performance of modern language models (LMs) We show that LMs can represent the same family of distributions over strings as probabilistic Turing machines.
arXiv Detail & Related papers (2024-06-20T10:59:02Z)
Lower Bounds on the Expressivity of Recurrent Neural Language Models [47.537525313391164]
Investigation into the representational capacity of neural LMs has predominantly focused on their ability to emphrecognize formal languages. We take a fresh look at the representational capacity of RNN LMs by connecting them to emphprobabilistic FSAs and demonstrate that RNN LMs with linearly bounded precision can express arbitrary regular LMs.
arXiv Detail & Related papers (2024-05-29T16:02:09Z)
On the Computational Complexity and Formal Hierarchy of Second Order Recurrent Neural Networks [59.85314067235965]
We extend the theoretical foundation for the $2nd$-order recurrent network ($2nd$ RNN) We prove there exists a class of a $2nd$ RNN that is Turing-complete with bounded time. We also demonstrate that $2$nd order RNNs, without memory, outperform modern-day models such as vanilla RNNs and gated recurrent units in recognizing regular grammars.
arXiv Detail & Related papers (2023-09-26T06:06:47Z)
Nondeterministic Stacks in Neural Networks [0.456877715768796]
We develop a differentiable data structure that efficiently simulates a nondeterministic pushdown automaton. We show that this raises their formal recognition power to arbitrary context-free languages. We also show that an RNN augmented with a nondeterministic stack is capable of surprisingly powerful behavior.
arXiv Detail & Related papers (2023-04-25T16:00:40Z)
Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer [59.73454783958702]
We propose a symbolic reasoning architecture that chains many join operators together to model output logical expressions. In particular, we demonstrate that such an ensemble of join-chains can express a broad subset of ''tree-structured'' first-order logical expressions, named FOET. We find that the widely used multi-head self-attention module in transformer can be understood as a special neural operator that implements the union bound of the join operator in probabilistic predicate space.
arXiv Detail & Related papers (2022-10-06T07:39:58Z)
Emergence of Machine Language: Towards Symbolic Intelligence with Neural Networks [73.94290462239061]
We propose to combine symbolism and connectionism principles by using neural networks to derive a discrete representation. By designing an interactive environment and task, we demonstrated that machines could generate a spontaneous, flexible, and semantic language.
arXiv Detail & Related papers (2022-01-14T14:54:58Z)
SyGNS: A Systematic Generalization Testbed Based on Natural Language Semantics [39.845425535943534]
We propose a Systematic Generalization testbed based on Natural language Semantics (SyGNS) We test whether neural networks can systematically parse sentences involving novel combinations of logical expressions such as quantifiers and negation. Experiments show that Transformer and GRU models can generalize to unseen combinations of quantifiers, negations, and modifier that are similar to given training instances in form, but not to the others.
arXiv Detail & Related papers (2021-06-02T11:24:41Z)
Recognizing Long Grammatical Sequences Using Recurrent Networks Augmented With An External Differentiable Stack [73.48927855855219]
Recurrent neural networks (RNNs) are a widely used deep architecture for sequence modeling, generation, and prediction. RNNs generalize poorly over very long sequences, which limits their applicability to many important temporal processing and time series forecasting problems. One way to address these shortcomings is to couple an RNN with an external, differentiable memory structure, such as a stack. In this paper, we improve the memory-augmented RNN with important architectural and state updating mechanisms.
arXiv Detail & Related papers (2020-04-04T14:19:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.