Attention Is Indeed All You Need: Semantically Attention-Guided Decoding
for Data-to-Text NLG
- URL: http://arxiv.org/abs/2109.07043v1
- Date: Wed, 15 Sep 2021 01:42:51 GMT
- Title: Attention Is Indeed All You Need: Semantically Attention-Guided Decoding
for Data-to-Text NLG
- Authors: Juraj Juraska and Marilyn Walker
- Abstract summary: We propose a novel decoding method that extracts interpretable information from encoder-decoder models' cross-attention.
We show on three datasets its ability to dramatically reduce semantic errors in the generated outputs.
- Score: 0.913755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ever since neural models were adopted in data-to-text language generation,
they have invariably been reliant on extrinsic components to improve their
semantic accuracy, because the models normally do not exhibit the ability to
generate text that reliably mentions all of the information provided in the
input. In this paper, we propose a novel decoding method that extracts
interpretable information from encoder-decoder models' cross-attention, and
uses it to infer which attributes are mentioned in the generated text; this
inference is subsequently used to rescore beam hypotheses. Using this decoding method with
T5 and BART, we show on three datasets its ability to dramatically reduce
semantic errors in the generated outputs, while maintaining their
state-of-the-art quality.
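To make the idea concrete, below is a minimal, illustrative sketch (not the authors' implementation): cross-attention is aggregated over decoder layers and heads to estimate which input attributes a beam hypothesis mentions, and hypotheses are then rescored with a coverage penalty. The Hugging Face `t5-small` checkpoint, the character-span format for attributes, and the aggregation, threshold, and penalty values are all assumptions made for the example.

```python
# Illustrative sketch only: rescoring beam hypotheses with cross-attention-based
# attribute coverage. Checkpoint, span format, aggregation, threshold, and penalty
# are assumptions for this example, not the paper's actual configuration.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small").eval()


def attribute_coverage(source, hypothesis, attribute_spans, threshold=0.05):
    """Estimate which input attributes a hypothesis mentions, from cross-attention mass."""
    enc = tokenizer(source, return_tensors="pt", return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")[0].tolist()  # character span of each source token
    labels = tokenizer(hypothesis, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(**enc, labels=labels, output_attentions=True)
    # out.cross_attentions: one tensor per decoder layer, each (1, heads, tgt_len, src_len).
    attn = torch.stack(out.cross_attentions).mean(dim=(0, 2))[0]  # average layers/heads -> (tgt_len, src_len)
    peak = attn.max(dim=0).values  # strongest attention each source token ever receives
    covered = {}
    for name, (start, end) in attribute_spans.items():
        tokens = [i for i, (s, e) in enumerate(offsets) if s < end and e > start]
        covered[name] = bool(tokens) and peak[tokens].max().item() >= threshold
    return covered


def rescore(beam, source, attribute_spans, penalty=1.0):
    """Re-rank (log_prob, text) hypotheses, penalizing each attribute judged unmentioned."""
    rescored = []
    for log_prob, text in beam:
        covered = attribute_coverage(source, text, attribute_spans)
        missing = sum(not ok for ok in covered.values())
        rescored.append((log_prob - penalty * missing, text))
    return sorted(rescored, reverse=True)
```

Here `attribute_spans` would map each attribute name to the character span of its value in the linearized input; in practice the threshold and penalty would be tuned on development data. The point is only that attribute coverage can be read off the cross-attention and folded into beam scores.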
Related papers
- Text2Data: Low-Resource Data Generation with Textual Control [104.38011760992637]
Natural language serves as a common and straightforward control signal for humans to interact seamlessly with machines.
We propose Text2Data, a novel approach that utilizes unlabeled data to understand the underlying data distribution through an unsupervised diffusion model.
It undergoes controllable finetuning via a novel constraint optimization-based learning objective that ensures controllability and effectively counteracts catastrophic forgetting.
arXiv Detail & Related papers (2024-02-08T03:41:39Z)
- PixT3: Pixel-based Table-To-Text Generation [66.96636025277536]
We present PixT3, a multimodal table-to-text model that overcomes the challenges of linearization and input size limitations.
Experiments on the ToTTo and Logic2Text benchmarks show that PixT3 is competitive with and superior to generators that operate solely on text.
arXiv Detail & Related papers (2023-11-16T11:32:47Z)
- Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge Selection [71.20871905457174]
Language models (LMs) have revolutionized the way we interact with information, but they often generate nonfactual text.
Previous methods use external knowledge as references for text generation to enhance factuality but often struggle with the knowledge mix-up of irrelevant references.
We present DKGen, which divides text generation into an iterative process.
arXiv Detail & Related papers (2023-08-30T02:22:40Z)
- Controlled Text Generation using T5 based Encoder-Decoder Soft Prompt Tuning and Analysis of the Utility of Generated Text in AI [2.381686610905853]
We introduce the novel soft prompt tuning method of using soft prompts at both encoder and decoder levels together in a T5 model.
We also investigate the feasibility of steering the output of this extended soft-prompted T5 model at the decoder level.
arXiv Detail & Related papers (2022-12-06T12:31:53Z)
- Informative Text Generation from Knowledge Triples [56.939571343797304]
We propose a novel memory-augmented generator that employs a memory network to memorize the useful knowledge learned during training.
We derive a dataset from WebNLG for our new setting and conduct extensive experiments to investigate the effectiveness of our model.
arXiv Detail & Related papers (2022-09-26T14:35:57Z)
- Reinforced Generative Adversarial Network for Abstractive Text Summarization [7.507096634112164]
Sequence-to-sequence models provide a viable new approach to generative summarization.
These models have notable drawbacks: their grasp of the details of the original text is often inaccurate, and the text they generate often contains repetitions.
We propose a new architecture that combines reinforcement learning and adversarial generative networks to enhance the sequence-to-sequence attention model.
arXiv Detail & Related papers (2021-05-31T17:34:47Z)
- Rethinking Text Line Recognition Models [57.47147190119394]
We consider two decoder families (Connectionist Temporal Classification and Transformer) and three encoder modules (Bidirectional LSTMs, Self-Attention, and GRCLs).
We compare their accuracy and performance on widely used public datasets of scene and handwritten text.
Unlike the more common Transformer-based models, this architecture can handle inputs of arbitrary length.
arXiv Detail & Related papers (2021-04-15T21:43:13Z)
- Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding [80.3811072650087]
We study natural language watermarking as a defense to help better mark and trace the provenance of text.
We introduce the Adversarial Watermarking Transformer (AWT) with a jointly trained encoder-decoder and adversarial training.
AWT is the first end-to-end model to hide data in text by automatically learning -- without ground truth -- word substitutions along with their locations.
arXiv Detail & Related papers (2020-09-07T11:01:24Z)