PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation
- URL: http://arxiv.org/abs/2205.12697v1
- Date: Wed, 25 May 2022 11:55:54 GMT
- Title: PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation
- Authors: Ao Liu, Haoyu Dong, Naoaki Okazaki, Shi Han, Dongmei Zhang
- Abstract summary: We propose a PLOG (Pretrained Logical Form Generator) framework to improve the generation fidelity.
PLOG is first pretrained on a table-to-logic-form generation task, then finetuned on downstream table-to-text tasks.
PLOG can learn logical inference from table-logic pairs far more reliably than from table-text pairs.
- Score: 44.78200830757109
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Logical table-to-text generation is a task that involves generating logically
faithful sentences from tables, which requires models to derive logical-level
facts from table records via logical inference. This raises a new challenge for
the logical-level content planning of table-to-text models. However, directly
learning the logical inference knowledge from table-text pairs is very
difficult for neural models because of the ambiguity of natural language and
the scarcity of parallel data. Hence, even large-scale pre-trained language
models exhibit low logical fidelity on logical table-to-text generation. In this work, we
propose a PLOG (Pretrained Logical Form Generator) framework to improve the
generation fidelity. Specifically, PLOG is first pretrained on a
table-to-logic-form generation (table-to-logic) task, then finetuned on
downstream table-to-text tasks. The formal definition of logical forms enables
us to collect a large number of accurate logical forms from tables without human
annotation. In addition, PLOG can learn logical inference from table-logic
pairs far more reliably than from table-text pairs. To evaluate our model,
we further collect a controlled logical table-to-text dataset, CONTLOG, based on
an existing dataset. On two benchmarks, LOGICNLG and CONTLOG, PLOG outperforms
strong baselines by a large margin in logical fidelity, demonstrating the
effectiveness of table-to-logic pretraining.
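To make the table-to-logic idea concrete, below is a minimal sketch (not the authors' code) of why an executable, formally defined logical-form language allows accurate table-logic pairs to be collected without human annotation: candidate forms are executed against the table and only those that evaluate to true are kept as pretraining targets. The toy table, the tiny operator set (count, max, argmax, hop, eq), and the linearization format are illustrative simplifications of the Logic2Text-style logical forms used in this line of work.

```python
# Minimal sketch (not from the paper): collecting accurate table-to-logic
# pretraining pairs by executing candidate logical forms against a table.
# The table, the operator set, and the linearization are illustrative
# simplifications of Logic2Text-style logical forms.

table = {
    "header": ["nation", "gold", "silver"],
    "rows": [["norway", 14, 14], ["germany", 12, 10], ["canada", 11, 8]],
}

def col(table, name):
    """Return one column of the table as a list of cell values."""
    i = table["header"].index(name)
    return [row[i] for row in table["rows"]]

def execute(form, table):
    """Recursively evaluate a nested logical form; leaves are literals."""
    if not isinstance(form, tuple):            # literal leaf (string or number)
        return form
    op, *args = form
    if op == "count":                          # number of rows in the table
        return len(table["rows"])
    if op == "max":                            # largest value in a column
        return max(col(table, args[0]))
    if op == "argmax":                         # row with the largest value in a column
        i = table["header"].index(args[0])
        return max(table["rows"], key=lambda r: r[i])
    if op == "hop":                            # project one cell out of a row
        row = execute(args[0], table)
        return row[table["header"].index(args[1])]
    if op == "eq":                             # equality between two sub-forms
        return execute(args[0], table) == execute(args[1], table)
    raise ValueError(f"unknown operator: {op}")

def linearize(table):
    """Flatten the table into the source sequence a seq2seq model would read."""
    header = " | ".join(table["header"])
    rows = " ; ".join(" | ".join(str(c) for c in row) for row in table["rows"])
    return f"header: {header} rows: {rows}"

# Enumerate candidate forms and keep only those that execute to True;
# each surviving (linearized table, logical form) pair is a pretraining example.
candidates = [
    ("eq", ("hop", ("argmax", "gold"), "nation"), "norway"),
    ("eq", ("max", "silver"), 14),
    ("eq", ("count",), 4),                     # evaluates to False -> discarded
]
pairs = [(linearize(table), repr(f)) for f in candidates if execute(f, table)]
for source, target in pairs:
    print(target)
```

Under the PLOG recipe, pairs collected this way would pretrain a sequence-to-sequence generator to map the linearized table to the logical form, before the same model is finetuned on table-to-text pairs from LOGICNLG or CONTLOG.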
Related papers
- Empower Nested Boolean Logic via Self-Supervised Curriculum Learning [67.46052028752327]
We find that pre-trained language models, including large language models, behave only like random selectors when faced with multi-nested Boolean logic.
To empower language models with this fundamental capability, this paper proposes a new self-supervised learning method, Curriculum Logical Reasoning (CLR).
arXiv Detail & Related papers (2023-10-09T06:54:02Z)
- Investigating the Robustness of Natural Language Generation from Logical Forms via Counterfactual Samples [30.079030298066847]
State-of-the-art methods based on pre-trained models have achieved remarkable performance on the standard test dataset.
We question whether these methods really learn how to perform logical reasoning, rather than just relying on the spurious correlations between the headers of the tables and operators of the logical form.
We propose two approaches to reduce the model's reliance on the shortcut.
arXiv Detail & Related papers (2022-10-16T14:14:53Z)
- Discourse-Aware Graph Networks for Textual Logical Reasoning [142.0097357999134]
Passage-level logical relations represent entailment or contradiction between propositional units (e.g., a concluding sentence).
We propose logic structural-constraint modeling to solve logical reasoning QA and introduce discourse-aware graph networks (DAGNs).
The networks first construct logic graphs leveraging in-line discourse connectives and generic logic theories, then learn logic representations by end-to-end evolving the logic relations with an edge-reasoning mechanism and updating the graph features.
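For a rough sense of how in-line discourse connectives can seed such a logic graph, here is a minimal sketch, assuming a hand-picked connective list and illustrative relation labels; it covers only the graph-construction step, not the learned edge-reasoning mechanism or graph-feature updates described above.

```python
# Minimal sketch (assumptions, not the DAGN implementation): split a passage
# into discourse units at in-line connectives and label the edges between them.
# The connective list and relation labels below are illustrative only.
import re

CONNECTIVES = {
    "because": "support",
    "therefore": "conclusion",
    "however": "contrast",
    "unless": "condition",
}

def build_logic_graph(passage: str):
    """Return (units, edges): discourse units plus labelled edges between
    consecutive units joined by a known connective."""
    pattern = r"\b(" + "|".join(CONNECTIVES) + r")\b"
    parts = re.split(pattern, passage, flags=re.IGNORECASE)
    units, edges = [], []
    pending = None                      # connective waiting for its right-hand unit
    for part in (p.strip(" ,.") for p in parts):
        if not part:
            continue
        if part.lower() in CONNECTIVES:
            pending = CONNECTIVES[part.lower()]
        elif pending is not None and units:
            units.append(part)
            edges.append((len(units) - 2, len(units) - 1, pending))
            pending = None
        else:
            units.append(part)
    return units, edges

units, edges = build_logic_graph(
    "The model is faithful because it plans at the logical level, "
    "therefore it makes fewer factual errors."
)
for i, j, rel in edges:
    print(f"{units[i]!r} --{rel}--> {units[j]!r}")
```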
arXiv Detail & Related papers (2022-07-04T14:38:49Z)
- Improving Logical-Level Natural Language Generation with Topic-Conditioned Data Augmentation and Logical Form Generation [18.93964332724296]
We propose a topic-conditioned data augmentation (TopicDA) to generate logical forms and textual descriptions directly from tables.
We introduce logical form generation (LG), a dual task of Logic2text that requires generating a valid logical form based on a text description of a table.
We also propose a semi-supervised learning approach to jointly train a Logic2text and an LG model with both labeled and augmented data.
arXiv Detail & Related papers (2021-12-12T13:50:18Z)
- Logic-Driven Context Extension and Data Augmentation for Logical Reasoning of Text [65.24325614642223]
We propose to understand logical symbols and expressions in the text to arrive at the answer.
Based on such logical information, we put forward a context extension framework and a data augmentation algorithm.
Our method achieves state-of-the-art performance, and both the logic-driven context extension framework and the data augmentation algorithm help improve accuracy.
arXiv Detail & Related papers (2021-05-08T10:09:36Z)
- Logic2Text: High-Fidelity Natural Language Generation from Logical Forms [84.5687465831598]
We formulate logical-level NLG as generation from logical forms in order to obtain controllable, high-fidelity, and faithful generations.
We present a new large-scale dataset, Logic2Text, with 10,753 descriptions involving common logic types paired with the underlying logical forms.
arXiv Detail & Related papers (2020-04-30T04:06:06Z)
- Logical Natural Language Generation from Open-Domain Tables [107.04385677577862]
We propose a new task where a model is tasked with generating natural language statements that can be logically entailed by the facts.
To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019), which features a wide range of logical/symbolic inferences.
The new task poses challenges to the existing monotonic generation frameworks due to the mismatch between sequence order and logical order.
arXiv Detail & Related papers (2020-04-22T06:03:10Z)