Sequence-to-Sequence Learning with Latent Neural Grammars
- URL: http://arxiv.org/abs/2109.01135v1
- Date: Thu, 2 Sep 2021 17:58:08 GMT
- Title: Sequence-to-Sequence Learning with Latent Neural Grammars
- Authors: Yoon Kim
- Abstract summary: Sequence-to-sequence learning with neural networks has become the de facto standard for sequence prediction tasks.
While flexible and performant, these models often require large datasets for training and can fail spectacularly on benchmarks designed to test for compositional generalization.
This work explores an alternative, hierarchical approach to sequence-to-sequence learning with quasi-synchronous grammars.
- Score: 12.624691611049341
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequence-to-sequence learning with neural networks has become the de facto
standard for sequence prediction tasks. This approach typically models the
local distribution over the next word with a powerful neural network that can
condition on arbitrary context. While flexible and performant, these models
often require large datasets for training and can fail spectacularly on
benchmarks designed to test for compositional generalization. This work
explores an alternative, hierarchical approach to sequence-to-sequence learning
with quasi-synchronous grammars, where each node in the target tree is
transduced by a node in the source tree. Both the source and target trees are
treated as latent and induced during training. We develop a neural
parameterization of the grammar which enables parameter sharing over the
combinatorial space of derivation rules without the need for manual feature
engineering. We apply this latent neural grammar to various domains -- a
diagnostic language navigation task designed to test for compositional
generalization (SCAN), style transfer, and small-scale machine translation --
and find that it performs respectably compared to standard baselines.
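As a rough illustration of the kind of neural parameterization the abstract describes (a sketch only, not the paper's actual model), the snippet below scores quasi-synchronous derivation rules with an MLP over shared symbol embeddings, so parameters are shared across the combinatorial rule space rather than stored in a per-rule table. All class names, symbol inventories, and dimensions are invented for the example.

```python
# Illustrative sketch (PyTorch): a quasi-synchronous rule expands a target
# nonterminal conditioned on an aligned source-tree node, and its score is
# computed from shared embeddings instead of a rule-indexed parameter table.
import torch
import torch.nn as nn

class NeuralRuleScorer(nn.Module):
    def __init__(self, num_symbols: int, dim: int = 64):
        super().__init__()
        # One shared embedding table for grammar symbols; every rule score is
        # built from these vectors, so parameter count does not grow with the
        # number of possible rules.
        self.emb = nn.Embedding(num_symbols, dim)
        self.mlp = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, parent, child, source_node):
        # parent, child: target-side symbol ids; source_node: id of the aligned
        # source-tree node's symbol. Returns an unnormalized log-potential.
        h = torch.cat([self.emb(parent), self.emb(child), self.emb(source_node)], dim=-1)
        return self.mlp(h).squeeze(-1)

# Usage: score a batch of candidate rules. In a full model these log-potentials
# would feed a dynamic program (e.g. an inside algorithm) that marginalizes over
# the latent source and target trees during training; on SCAN-style data the
# grammar ultimately maps commands such as "jump twice" to "JUMP JUMP".
scorer = NeuralRuleScorer(num_symbols=50)
parent = torch.tensor([3, 5])
child = torch.tensor([7, 2])
src = torch.tensor([1, 4])
print(scorer(parent, child, src))  # tensor of shape (2,)
```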
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers.
It is common to instead use proxy tasks that are similar in only an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings; a toy sketch of this setup appears after this list.
arXiv Detail & Related papers (2024-11-11T16:33:25Z)
- SLFNet: Generating Semantic Logic Forms from Natural Language Using Semantic Probability Graphs [6.689539418123863]
Building natural language interfaces typically uses a semantic parser to parse the user's natural language and convert it into structured Semantic Logic Forms (SLFs).
We propose a novel neural network, SLFNet, which incorporates dependent syntactic information as prior knowledge and can capture the long-range interactions between contextual information and words.
Experiments show that SLFNet achieves state-of-the-art performance on the ChineseQCI-TS and Okapi datasets, and competitive performance on the ATIS dataset.
arXiv Detail & Related papers (2024-03-29T02:42:39Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- A Multi-Grained Self-Interpretable Symbolic-Neural Model For Single/Multi-Labeled Text Classification [29.075766631810595]
We propose a Symbolic-Neural model that can learn to explicitly predict class labels of text spans from a constituency tree.
As the structured language model learns to predict constituency trees in a self-supervised manner, only raw texts and sentence-level labels are required as training data.
Our experiments demonstrate that our approach could achieve good prediction accuracy in downstream tasks.
arXiv Detail & Related papers (2023-03-06T03:25:43Z)
- Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z)
- Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z)
- Neural-Symbolic Recursive Machine for Systematic Generalization [113.22455566135757]
We introduce the Neural-Symbolic Recursive Machine (NSR), whose core is a Grounded Symbol System (GSS).
NSR integrates neural perception, syntactic parsing, and semantic reasoning.
We evaluate NSR's efficacy across four challenging benchmarks designed to probe systematic generalization capabilities.
arXiv Detail & Related papers (2022-10-04T13:27:38Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- SyGNS: A Systematic Generalization Testbed Based on Natural Language Semantics [39.845425535943534]
We propose a Systematic Generalization testbed based on Natural language Semantics (SyGNS).
We test whether neural networks can systematically parse sentences involving novel combinations of logical expressions such as quantifiers and negation.
Experiments show that Transformer and GRU models can generalize to unseen combinations of quantifiers, negations, and modifiers that are similar in form to the training instances, but not to others.
arXiv Detail & Related papers (2021-06-02T11:24:41Z)
- Can RNNs learn Recursive Nested Subject-Verb Agreements? [4.094098809740732]
Language processing requires the ability to extract nested tree structures.
Recent advances in Recurrent Neural Networks (RNNs) achieve near-human performance in some language tasks.
arXiv Detail & Related papers (2021-01-06T20:47:02Z)
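The recognizer paper above frames the task as binary classification of strings. As a toy illustration only (not that paper's benchmark, models, or training setup), the sketch below trains a small GRU to recognize the formal language a^n b^n; the language choice, model size, and hyperparameters are all invented for the example.

```python
# Minimal sketch: training a GRU as a binary recognizer of a^n b^n over {a, b}.
import random
import torch
import torch.nn as nn

VOCAB = {"a": 0, "b": 1}

def sample(max_n=8):
    """Return (string, label); positives are a^n b^n, negatives are random strings."""
    if random.random() < 0.5:
        n = random.randint(1, max_n)
        return "a" * n + "b" * n, 1.0
    s = "".join(random.choice("ab") for _ in range(random.randint(2, 2 * max_n)))
    n = len(s) // 2
    # Re-label in case the random string happens to fall inside the language.
    label = 1.0 if len(s) % 2 == 0 and s == "a" * n + "b" * n else 0.0
    return s, label

class Recognizer(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.emb = nn.Embedding(len(VOCAB), dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, 1)

    def forward(self, ids):  # ids: (1, seq_len)
        _, h = self.rnn(self.emb(ids))
        return self.out(h[-1]).squeeze(-1)  # logit for "string is in the language"

model = Recognizer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(2000):
    s, y = sample()
    ids = torch.tensor([[VOCAB[c] for c in s]])
    loss = loss_fn(model(ids), torch.tensor([y]))
    opt.zero_grad()
    loss.backward()
    opt.step()
```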
This list is automatically generated from the titles and abstracts of the papers on this site.