Improving Compositional Generalization with Self-Training for
Data-to-Text Generation
- URL: http://arxiv.org/abs/2110.08467v1
- Date: Sat, 16 Oct 2021 04:26:56 GMT
- Title: Improving Compositional Generalization with Self-Training for
Data-to-Text Generation
- Authors: Sanket Vaibhav Mehta, Jinfeng Rao, Yi Tay, Mihir Kale, Ankur Parikh,
Hongtao Zhong, Emma Strubell
- Abstract summary: We study the compositional generalization of current generation models in data-to-text tasks.
By simulating structural shifts in the compositional Weather dataset, we show that T5 models fail to generalize to unseen structures.
We propose an approach based on self-training using finetuned BLEURT for pseudo-response selection.
- Score: 36.973617793800315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data-to-text generation focuses on generating fluent natural language
responses from structured semantic representations. Such representations are
compositional, allowing for the combination of atomic meaning schemata in
various ways to express the rich semantics in natural language. Recently,
pretrained language models (LMs) have achieved impressive results on
data-to-text tasks, though it remains unclear to what extent these LMs
generalize to new semantic representations. In this work, we systematically
study the compositional generalization of current state-of-the-art generation
models in data-to-text tasks. By simulating structural shifts in the
compositional Weather dataset, we show that T5 models fail to generalize to
unseen structures. Next, we show that template-based input representations
greatly improve model performance, and that model scale does not trivially solve
the lack of generalization. To further improve the model's performance, we
propose an approach based on self-training using finetuned BLEURT for
pseudo-response selection. Extensive experiments on the few-shot Weather and
multi-domain SGD datasets demonstrate strong gains from our method.
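To make the proposed recipe concrete, here is a minimal sketch of one self-training round with BLEURT-based pseudo-response selection. It is an illustration under stated assumptions: the checkpoints, the `linearize` template helper, the reference-free use of BLEURT, and the 0.5 threshold are placeholders, not the paper's exact setup.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer
from bleurt import score as bleurt_score  # https://github.com/google-research/bleurt

# Assumed starting points: a T5 already finetuned on the few-shot labeled data,
# and a BLEURT checkpoint finetuned to judge response quality (per the abstract).
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
scorer = bleurt_score.BleurtScorer("path/to/finetuned-bleurt")  # hypothetical path

def linearize(mr: dict) -> str:
    """Hypothetical template-based linearization of a structured meaning representation."""
    return " ".join(f"[{key}] {value}" for key, value in mr.items())

@torch.no_grad()
def pseudo_label(unlabeled_mrs, num_candidates=4, threshold=0.5):
    """Generate candidate responses for unlabeled inputs; keep only high-scoring ones."""
    selected = []
    for mr in unlabeled_mrs:
        inputs = tokenizer(linearize(mr), return_tensors="pt")
        outputs = model.generate(
            **inputs, do_sample=True, num_return_sequences=num_candidates, max_length=128
        )
        candidates = tokenizer.batch_decode(outputs, skip_special_tokens=True)
        # The paper finetunes BLEURT for pseudo-response selection; scoring
        # candidates against the linearized input is a stand-in approximation here.
        scores = scorer.score(
            references=[linearize(mr)] * len(candidates), candidates=candidates
        )
        best, best_score = max(zip(candidates, scores), key=lambda pair: pair[1])
        if best_score >= threshold:
            selected.append((mr, best))  # pseudo-labeled pair for the next round
    return selected
```

Selected pairs would then be mixed back into the labeled few-shot data to retrain the model, and the loop repeated.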
Related papers
- Relation-based Counterfactual Data Augmentation and Contrastive Learning for Robustifying Natural Language Inference Models [0.0]
We propose a method in which we use token-based and sentence-based augmentation methods to generate counterfactual sentence pairs.
We show that the proposed method can improve the performance and robustness of the NLI model.
arXiv Detail & Related papers (2024-10-28T03:43:25Z)
- Detection and Measurement of Syntactic Templates in Generated Text [58.111650675717414]
We offer an analysis of syntactic features to characterize general repetition in models.
We find that models tend to produce templated text in downstream tasks at a higher rate than is found in human-reference texts (a rough sketch of one such measurement follows this entry).
arXiv Detail & Related papers (2024-06-28T19:34:23Z)
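As a rough illustration of the entry above, templatedness can be proxied by how often the same part-of-speech n-gram recurs across a set of texts; the tagger, the n-gram length, and the repetition ratio below are assumptions, not the paper's actual metric.

```python
from collections import Counter
import nltk  # requires: nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

def pos_ngram_repetition(texts, n=4):
    """Fraction of POS n-gram occurrences whose pattern appears more than once."""
    counts = Counter()
    for text in texts:
        tags = [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(text))]
        counts.update(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))
    total = sum(counts.values())
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / total if total else 0.0

# Comparing this ratio on model generations vs. human references would surface
# the elevated templating rate that the paper reports.
```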
- Structured Language Generation Model for Robust Structure Prediction [6.4736137270915215]
We propose a framework that reduces sequence-to-sequence problems to classification problems via loss-calibration and decoding methodologies.
Our experimental results show that the Structured Language Generation Model (SLGM) maintains performance without explicit dataset information and can potentially replace dataset-specific fine-tuning.
arXiv Detail & Related papers (2024-02-14T06:33:22Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves new state-of-the-art results on all the structured prediction tasks we evaluated.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- Model Criticism for Long-Form Text Generation [113.13900836015122]
We apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text.
We perform experiments on three representative aspects of high-level discourse -- coherence, coreference, and topicality.
We find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
arXiv Detail & Related papers (2022-10-16T04:35:58Z)
- Learning Disentangled Representations for Natural Language Definitions [0.0]
We argue that recurrent syntactic and semantic regularities in textual data can be used to provide the models with both structural biases and generative factors.
We leverage the semantic structures present in a representative and semantically dense category of sentence types, definitional sentences, for training a Variational Autoencoder to learn disentangled representations.
arXiv Detail & Related papers (2022-09-22T14:31:55Z)
- Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale [31.293175512404172]
We introduce Transformer Grammars -- a class of Transformer language models that combine the expressive power, scalability, and strong performance of Transformers with syntactic inductive biases.
We find that Transformer Grammars outperform various strong baselines on multiple syntax-sensitive language modeling evaluation metrics.
arXiv Detail & Related papers (2022-03-01T17:22:31Z)
- Improving Compositional Generalization with Latent Structure and Data Augmentation [39.24527889685699]
We present a more powerful data recombination method using a model called the Compositional Structure Learner (CSL).
CSL is a generative model with a quasi-synchronous context-free grammar backbone.
Training T5 on examples generated by CSL effectively transfers most of CSL's compositional bias to T5 for diagnostic tasks.
arXiv Detail & Related papers (2021-12-14T18:03:28Z)
- Learning to Synthesize Data for Semantic Parsing [57.190817162674875]
We propose a generative model that uses a PCFG to model the composition of programs and a pre-trained BART to map a program to an utterance.
Due to the simplicity of the PCFG and of pre-trained BART, our generative model can be efficiently learned from existing data at hand.
We evaluate our method in both in-domain and out-of-domain settings of text-to-SQL parsing on the standard benchmarks of GeoQuery and Spider (a toy sketch of the grammar-sampling step follows this entry).
arXiv Detail & Related papers (2021-04-12T21:24:02Z)
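The following toy sketch shows the grammar-sampling half of that synthesis pipeline. The miniature grammar and its symbols are invented for illustration (the paper learns its PCFG from data), and the BART mapping step is only indicated in a comment.

```python
import random
from nltk import PCFG
from nltk.grammar import Nonterminal

# Invented miniature PCFG over a toy query "program" language.
grammar = PCFG.fromstring("""
    Q    -> 'SELECT' COL 'FROM' TAB [0.6] | 'SELECT' COL 'FROM' TAB COND [0.4]
    COND -> 'WHERE' COL '=' VAL [1.0]
    COL  -> 'city' [0.5] | 'population' [0.5]
    TAB  -> 'geo' [1.0]
    VAL  -> 'texas' [1.0]
""")

def sample(symbol):
    """Recursively sample one derivation from the PCFG."""
    if not isinstance(symbol, Nonterminal):
        return [symbol]  # terminal: emit as-is
    productions = grammar.productions(lhs=symbol)
    chosen = random.choices(productions, weights=[p.prob() for p in productions])[0]
    return [token for sym in chosen.rhs() for token in sample(sym)]

program = " ".join(sample(grammar.start()))
print(program)
# A pre-trained seq2seq model (e.g., BART) would then map `program` to a
# natural-language utterance, yielding a synthetic (utterance, program) pair.
```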
- SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.