Self-training from Self-memory in Data-to-text Generation
- URL: http://arxiv.org/abs/2401.10567v1
- Date: Fri, 19 Jan 2024 09:13:28 GMT
- Title: Self-training from Self-memory in Data-to-text Generation
- Authors: Hoang-Thang Ta
- Abstract summary: This paper introduces a novel training model, self-training from self-memory (STSM), in data-to-text generation (DTG).
The quality of self-memory is validated by two models, data-to-text (D2T) and text-to-data (T2D).
- Score: 3.844398528249339
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces a novel training model, self-training from
self-memory (STSM), in data-to-text generation (DTG), allowing the model to
self-train on subsets that include self-memory (outputs inferred directly from
the trained models) and/or new data. The quality of self-memory is validated by
two models, data-to-text (D2T) and text-to-data (T2D), under two pre-defined
conditions: (1) all source values appear in the outputs of the D2T model, and
(2) the T2D model can convert those outputs back into the source data. We
utilize a greedy algorithm to generate shorter D2T outputs, provided they
contain all source values. Subsequently, we use the T2D model to confirm that
these outputs capture the input relationships by demonstrating their capacity
to convert text back into data. With only 30% of the dataset, we can train the
D2T model to performance competitive with full training in the same setup. We
experiment with our model on two datasets, E2E NLG and DART. STSM offers the
D2T model a generalization capability from its subset memory while reducing
training data volume. Ultimately, we anticipate that this paper will contribute
to continual learning solutions that adapt to new training data, incorporating
it as a form of self-memory in DTG tasks. The curated dataset is publicly
available at: https://github.com/hoangthangta/STSM.
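
A minimal sketch of the two self-memory validation conditions and the greedy shortening step described in the abstract, written in Python. The names below (contains_all_values, greedy_shortest, accept_as_self_memory, the stub T2D callable) are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

from typing import Callable, Dict, List, Optional

def contains_all_values(text: str, source: Dict[str, str]) -> bool:
    # Condition (1): every source value must appear in the D2T output.
    return all(value.lower() in text.lower() for value in source.values())

def greedy_shortest(candidates: List[str], source: Dict[str, str]) -> Optional[str]:
    # Greedy step: among outputs that cover all source values, keep the shortest.
    valid = [c for c in candidates if contains_all_values(c, source)]
    return min(valid, key=len) if valid else None

def round_trips(text: str, source: Dict[str, str],
                t2d: Callable[[str], Dict[str, str]]) -> bool:
    # Condition (2): the T2D model must reconstruct the original source data.
    return t2d(text) == source

def accept_as_self_memory(candidates: List[str], source: Dict[str, str],
                          t2d: Callable[[str], Dict[str, str]]) -> Optional[str]:
    # An output is accepted into self-memory only if both conditions hold.
    best = greedy_shortest(candidates, source)
    return best if best is not None and round_trips(best, source, t2d) else None

# Toy usage with a stub T2D model (illustration only, not a trained model).
source = {"name": "Aromi", "food": "Italian"}
candidates = [
    "Aromi serves Italian food in the city centre.",
    "Aromi is an Italian restaurant.",
]
stub_t2d = lambda text: {"name": "Aromi", "food": "Italian"}
print(accept_as_self_memory(candidates, source, stub_t2d))
# -> "Aromi is an Italian restaurant."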
Related papers
- Triples-to-isiXhosa (T2X): Addressing the Challenges of Low-Resource
Agglutinative Data-to-Text Generation [9.80836683456026]
We tackle data-to-text for isiXhosa, which is low-resource and agglutinative.
We introduce Triples-to-isiXhosa (T2X), a new dataset based on a subset of WebNLG.
We develop an evaluation framework for T2X that measures how accurately generated text describes the data.
arXiv Detail & Related papers (2024-03-12T11:53:27Z) - SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with
Auto-Generated Data [73.23388142296535]
SELMA improves the faithfulness of T2I models by fine-tuning models on automatically generated, multi-skill image-text datasets.
We show that SELMA significantly improves the semantic alignment and text faithfulness of state-of-the-art T2I diffusion models on multiple benchmarks.
We also show that fine-tuning with image-text pairs auto-collected via SELMA shows comparable performance to fine-tuning with ground truth data.
arXiv Detail & Related papers (2024-03-11T17:35:33Z) - HoloDiffusion: Training a 3D Diffusion Model using 2D Images [71.1144397510333]
We introduce a new diffusion setup that can be trained, end-to-end, with only posed 2D images for supervision.
We show that our diffusion models are scalable, train robustly, and are competitive in terms of sample quality and fidelity to existing approaches for 3D generative modeling.
arXiv Detail & Related papers (2023-03-29T07:35:56Z) - Learning from Multiple Sources for Data-to-Text and Text-to-Data [16.080265665849527]
Data-to-text (D2T) and text-to-data (T2D) are dual tasks that convert structured data, such as graphs or tables, into fluent text, and vice versa.
Current systems leverage pre-trained language models fine-tuned on D2T or T2D tasks.
This approach has two main limitations: first, a separate system has to be tuned for each task and source; second, learning is limited by the scarcity of available corpora.
We introduce a variational auto-encoder model with disentangled style and content variables that allows us to represent the diversity that...
arXiv Detail & Related papers (2023-02-22T10:39:33Z) - What Makes Data-to-Text Generation Hard for Pretrained Language Models? [17.07349898176898]
Expressing natural language descriptions of structured facts or relations -- data-to-text generation (D2T) -- increases the accessibility of structured knowledge repositories.
Previous work shows that pre-trained language models (PLMs) perform remarkably well on this task after fine-tuning on a significant amount of task-specific training data.
We conduct an empirical study of both fine-tuned and auto-regressive PLMs on the DART multi-domain D2T dataset.
arXiv Detail & Related papers (2022-05-23T17:58:39Z) - Neural Pipeline for Zero-Shot Data-to-Text Generation [3.42658286826597]
We propose to generate text by transforming single-item descriptions with a sequence of modules trained on general-domain text-based operations.
Our experiments on two major triple-to-text datasets -- WebNLG and E2E -- show that our approach enables D2T generation from RDF triples in zero-shot settings.
arXiv Detail & Related papers (2022-03-30T13:14:35Z) - Dual-Teacher Class-Incremental Learning With Data-Free Generative Replay [49.691610143011566]
We propose two novel knowledge transfer techniques for class-incremental learning (CIL).
First, we propose data-free generative replay (DF-GR) to mitigate catastrophic forgetting in CIL by using synthetic samples from a generative model.
Second, we introduce dual-teacher information distillation (DT-ID) for knowledge distillation from two teachers to one student.
arXiv Detail & Related papers (2021-06-17T22:13:15Z) - Evaluating Semantic Accuracy of Data-to-Text Generation with Natural
Language Inference [3.42658286826597]
A major challenge in evaluating data-to-text (D2T) generation is measuring the semantic accuracy of the generated text.
We propose a new metric for evaluating the semantic accuracy of D2T generation, based on a neural model pretrained for natural language inference (NLI).
Our experiments on two recent D2T datasets show that our metric can achieve high accuracy in identifying erroneous system outputs.
arXiv Detail & Related papers (2020-11-21T16:37:28Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z) - Data from Model: Extracting Data from Non-robust and Robust Models [83.60161052867534]
This work explores the reverse process of generating data from a model, attempting to reveal the relationship between the data and the model.
We repeat the process of Data to Model (DtM) and Data from Model (DfM) in sequence and explore the loss of feature mapping information.
Our results show that the accuracy drop is limited even after multiple sequences of DtM and DfM, especially for robust models.
arXiv Detail & Related papers (2020-07-13T05:27:48Z) - CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via
Cycle Training [63.11444020743543]
Deep learning models for graph-to-text (G2T) and text-to-graph (T2G) conversion suffer from scarce training data.
We present CycleGT, an unsupervised training method that can bootstrap from non-parallel graph and text data, and iteratively back translate between the two forms.
arXiv Detail & Related papers (2020-06-08T15:59:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.