Improving Logical-Level Natural Language Generation with
Topic-Conditioned Data Augmentation and Logical Form Generation
- URL: http://arxiv.org/abs/2112.06240v1
- Date: Sun, 12 Dec 2021 13:50:18 GMT
- Title: Improving Logical-Level Natural Language Generation with
Topic-Conditioned Data Augmentation and Logical Form Generation
- Authors: Ao Liu, Congjian Luo, Naoaki Okazaki
- Abstract summary: We propose a topic-conditioned data augmentation (TopicDA) to generate logical forms and textual descriptions directly from tables.
We introduce logical form generation (LG), a dual task of Logic2text that requires generating a valid logical form based on a text description of a table.
We also propose a semi-supervised learning approach to jointly train a Logic2text and an LG model with both labeled and augmented data.
- Score: 18.93964332724296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Logical Natural Language Generation, i.e., generating textual descriptions
that can be logically entailed by a structured table, has been a challenge due
to the low fidelity of the generation. Chen et al. (2020) have
addressed this problem by annotating interim logical programs to control the
generation contents and semantics, and presented the task of table-aware
logical form to text (Logic2text) generation. However, although table instances
are abundant in the real world, logical forms paired with textual descriptions
require costly human annotation work, which limits the performance of neural
models. To mitigate this, we propose topic-conditioned data augmentation
(TopicDA), which utilizes GPT-2 to generate unpaired logical forms and textual
descriptions directly from tables. We further introduce logical form generation
(LG), a dual task of Logic2text that requires generating a valid logical form
based on a text description of a table. We also propose a semi-supervised
learning approach to jointly train a Logic2text and an LG model with both
labeled and augmented data. The two models benefit from each other by providing
extra supervision signals through back-translation. Experimental results on the
Logic2text dataset and the LG task demonstrate that our approach can
effectively utilize the augmented data and outperform supervised baselines by a
substantial margin.
Related papers
- MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text
Generation
We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.
We conduct experiments on two data-to-text generation tasks, WebNLG and LogicNLG.
arXiv Detail & Related papers (2022-12-16T17:36:23Z)
- PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation
We propose a PLOG (Pretrained Logical Form Generator) framework to improve the generation fidelity.
PLOG is first pretrained on a table-to-logic-form generation task, then finetuned on downstream table-to-text tasks.
PLOG can learn logical inference from table-logic pairs much more reliably than from table-text pairs.
arXiv Detail & Related papers (2022-05-25T11:55:54Z)
- Logic-Consistency Text Generation from Semantic Parses
This paper first proposes SNOWBALL, a framework for logic-consistent text generation from semantic parses.
Second, we propose a novel automatic metric, BLEC, for evaluating the logical consistency between the semantic parses and generated texts.
arXiv Detail & Related papers (2021-08-02T01:12:18Z)
- Logic-Driven Context Extension and Data Augmentation for Logical Reasoning
of Text
We propose to understand logical symbols and expressions in the text to arrive at the answer.
Based on such logical information, we put forward a context extension framework and a data augmentation algorithm.
Our method achieves state-of-the-art performance, and both the logic-driven context extension framework and the data augmentation algorithm help improve accuracy.
arXiv Detail & Related papers (2021-05-08T10:09:36Z)
- Logic2Text: High-Fidelity Natural Language Generation from Logical Forms
We formulate logical level NLG as generation from logical forms in order to obtain controllable, high-fidelity, and faithful generations.
We present a new large-scale dataset, Logic2Text, with 10,753 descriptions involving common logic types paired with the underlying logical forms.
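To make the logical-form setting concrete, here is a simplified, hypothetical illustration of a Logic2Text-style pair: a small logical form executed against a toy table to check that the paired description is entailed. The operator names mirror common Logic2Text-style operators (filter_eq, count, eq), but the toy table and executor are invented for illustration.

```python
# Toy table and a simplified executor for a Logic2Text-style logical form.
# All data and function definitions here are illustrative assumptions.

table = [
    {"nation": "france", "gold": 2},
    {"nation": "japan", "gold": 1},
    {"nation": "france", "gold": 3},
]

def filter_eq(rows, col, val):
    # Keep rows whose column `col` equals `val`.
    return [r for r in rows if r[col] == val]

def count(rows):
    return len(rows)

def eq(a, b):
    return a == b

# Logical form: eq { count { filter_eq { all_rows ; nation ; france } } ; 2 }
# Description:  "france appears in two rows of the table"
result = eq(count(filter_eq(table, "nation", "france")), 2)
print(result)  # True
```

Executing the form against the table is exactly what makes such descriptions verifiable, which is the fidelity property the Logic2text task is built around.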
arXiv Detail & Related papers (2020-04-30T04:06:06Z)
- LogicalFactChecker: Leveraging Logical Operations for Fact Checking with
Graph Module Network
We propose LogicalFactChecker, a neural network approach capable of leveraging logical operations for fact checking.
It achieves state-of-the-art performance on TABFACT, a large-scale benchmark dataset.
arXiv Detail & Related papers (2020-04-28T17:04:19Z)
- Logical Natural Language Generation from Open-Domain Tables
We propose a new task where a model is tasked with generating natural language statements that can be logically entailed by the facts.
To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019), which features a wide range of logical/symbolic inferences.
The new task poses challenges to the existing monotonic generation frameworks due to the mismatch between sequence order and logical order.
arXiv Detail & Related papers (2020-04-22T06:03:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.