Sequence-to-Sequence Learning with Latent Neural Grammars
- URL: http://arxiv.org/abs/2109.01135v1
- Date: Thu, 2 Sep 2021 17:58:08 GMT
- Title: Sequence-to-Sequence Learning with Latent Neural Grammars
- Authors: Yoon Kim
- Abstract summary: Sequence-to-sequence learning with neural networks has become the de facto standard for sequence prediction tasks.
While flexible and performant, these models often require large datasets for training and can fail spectacularly on benchmarks designed to test for compositional generalization.
This work explores an alternative, hierarchical approach to sequence-to-sequence learning with quasi-synchronous grammars.
- Score: 12.624691611049341
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequence-to-sequence learning with neural networks has become the de facto
standard for sequence prediction tasks. This approach typically models the
local distribution over the next word with a powerful neural network that can
condition on arbitrary context. While flexible and performant, these models
often require large datasets for training and can fail spectacularly on
benchmarks designed to test for compositional generalization. This work
explores an alternative, hierarchical approach to sequence-to-sequence learning
with quasi-synchronous grammars, where each node in the target tree is
transduced by a node in the source tree. Both the source and target trees are
treated as latent and induced during training. We develop a neural
parameterization of the grammar which enables parameter sharing over the
combinatorial space of derivation rules without the need for manual feature
engineering. We apply this latent neural grammar to various domains -- a
diagnostic language navigation task designed to test for compositional
generalization (SCAN), style transfer, and small-scale machine translation --
and find that it performs respectably compared to standard baselines.
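As a rough illustration of the kind of neural parameterization the abstract describes (a sketch only, not the paper's actual model), the snippet below scores quasi-synchronous derivation rules with an MLP over shared symbol embeddings, so parameters are shared across the combinatorial rule space rather than stored in a per-rule table. All class names, symbol inventories, and dimensions are invented for the example.

```python
# Illustrative sketch (PyTorch): a quasi-synchronous rule expands a target
# nonterminal conditioned on an aligned source-tree node, and its score is
# computed from shared embeddings instead of a rule-indexed parameter table.
import torch
import torch.nn as nn

class NeuralRuleScorer(nn.Module):
    def __init__(self, num_symbols: int, dim: int = 64):
        super().__init__()
        # One shared embedding table for grammar symbols; every rule score is
        # built from these vectors, so parameter count does not grow with the
        # number of possible rules.
        self.emb = nn.Embedding(num_symbols, dim)
        self.mlp = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, parent, child, source_node):
        # parent, child: target-side symbol ids; source_node: id of the aligned
        # source-tree node's symbol. Returns an unnormalized log-potential.
        h = torch.cat([self.emb(parent), self.emb(child), self.emb(source_node)], dim=-1)
        return self.mlp(h).squeeze(-1)

# Usage: score a batch of candidate rules. In a full model these log-potentials
# would feed a dynamic program (e.g. an inside algorithm) that marginalizes over
# the latent source and target trees during training; on SCAN-style data the
# grammar ultimately maps commands such as "jump twice" to "JUMP JUMP".
scorer = NeuralRuleScorer(num_symbols=50)
parent = torch.tensor([3, 5])
child = torch.tensor([7, 2])
src = torch.tensor([1, 4])
print(scorer(parent, child, src))  # tensor of shape (2,)
```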
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers.
It is common to instead use proxy tasks that are similar in only an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings; a toy sketch of this setup appears after this list.
arXiv Detail & Related papers (2024-11-11T16:33:25Z)
- SLFNet: Generating Semantic Logic Forms from Natural Language Using Semantic Probability Graphs [6.689539418123863]
Building natural language interfaces typically uses a semantic parser to parse the user's natural language and convert it into structured Semantic Logic Forms (SLFs).
We propose a novel neural network, SLFNet, which incorporates dependent syntactic information as prior knowledge and can capture the long-range interactions between contextual information and words.
Experiments show that SLFNet achieves state-of-the-art performance on the ChineseQCI-TS and Okapi datasets, and competitive performance on the ATIS dataset.
arXiv Detail & Related papers (2024-03-29T02:42:39Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- A Multi-Grained Self-Interpretable Symbolic-Neural Model For Single/Multi-Labeled Text Classification [29.075766631810595]
We propose a Symbolic-Neural model that can learn to explicitly predict class labels of text spans from a constituency tree.
As the structured language model learns to predict constituency trees in a self-supervised manner, only raw texts and sentence-level labels are required as training data.
Our experiments demonstrate that our approach could achieve good prediction accuracy in downstream tasks.
arXiv Detail & Related papers (2023-03-06T03:25:43Z)
- Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z)
- Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z)
- Neural-Symbolic Recursive Machine for Systematic Generalization [113.22455566135757]
We introduce the Neural-Symbolic Recursive Machine (NSR), whose core is a Grounded Symbol System (GSS).
NSR integrates neural perception, syntactic parsing, and semantic reasoning.
We evaluate NSR's efficacy across four challenging benchmarks designed to probe systematic generalization capabilities.
arXiv Detail & Related papers (2022-10-04T13:27:38Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- SyGNS: A Systematic Generalization Testbed Based on Natural Language Semantics [39.845425535943534]
We propose a Systematic Generalization testbed based on Natural language Semantics (SyGNS).
We test whether neural networks can systematically parse sentences involving novel combinations of logical expressions such as quantifiers and negation.
Experiments show that Transformer and GRU models can generalize to unseen combinations of quantifiers, negations, and modifiers that are similar in form to the training instances, but not to others.
arXiv Detail & Related papers (2021-06-02T11:24:41Z)
- Can RNNs learn Recursive Nested Subject-Verb Agreements? [4.094098809740732]
Language processing requires the ability to extract nested tree structures.
Recent advances in Recurrent Neural Networks (RNNs) achieve near-human performance in some language tasks.
arXiv Detail & Related papers (2021-01-06T20:47:02Z)
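The recognizer paper above frames the task as binary classification of strings. As a toy illustration only (not that paper's benchmark, models, or training setup), the sketch below trains a small GRU to recognize the formal language a^n b^n; the language choice, model size, and hyperparameters are all invented for the example.

```python
# Minimal sketch: training a GRU as a binary recognizer of a^n b^n over {a, b}.
import random
import torch
import torch.nn as nn

VOCAB = {"a": 0, "b": 1}

def sample(max_n=8):
    """Return (string, label); positives are a^n b^n, negatives are random strings."""
    if random.random() < 0.5:
        n = random.randint(1, max_n)
        return "a" * n + "b" * n, 1.0
    s = "".join(random.choice("ab") for _ in range(random.randint(2, 2 * max_n)))
    n = len(s) // 2
    # Re-label in case the random string happens to fall inside the language.
    label = 1.0 if len(s) % 2 == 0 and s == "a" * n + "b" * n else 0.0
    return s, label

class Recognizer(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.emb = nn.Embedding(len(VOCAB), dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, 1)

    def forward(self, ids):  # ids: (1, seq_len)
        _, h = self.rnn(self.emb(ids))
        return self.out(h[-1]).squeeze(-1)  # logit for "string is in the language"

model = Recognizer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(2000):
    s, y = sample()
    ids = torch.tensor([[VOCAB[c] for c in s]])
    loss = loss_fn(model(ids), torch.tensor([y]))
    opt.zero_grad()
    loss.backward()
    opt.step()
```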
This list is automatically generated from the titles and abstracts of the papers on this site.