Logical Natural Language Generation from Open-Domain Tables
- URL: http://arxiv.org/abs/2004.10404v2
- Date: Tue, 28 Apr 2020 00:26:21 GMT
- Title: Logical Natural Language Generation from Open-Domain Tables
- Authors: Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen and William Yang Wang
- Abstract summary: We propose a new task in which a model generates natural language statements that can be logically entailed by the facts in an open-domain semi-structured table.
To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019), which features a wide range of logical/symbolic inferences.
The new task poses challenges to the existing monotonic generation frameworks due to the mismatch between sequence order and logical order.
- Score: 107.04385677577862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural natural language generation (NLG) models have recently shown
remarkable progress in fluency and coherence. However, existing studies on
neural NLG are primarily focused on surface-level realizations with limited
emphasis on logical inference, an important aspect of human thinking and
language. In this paper, we suggest a new NLG task where a model is tasked with
generating natural language statements that can be logically entailed by
the facts in an open-domain semi-structured table. To facilitate the study of
the proposed logical NLG problem, we use the existing TabFact dataset
(Chen et al., 2019), which features a wide range of logical/symbolic
inferences, as our testbed, and propose new automatic metrics to evaluate the
fidelity of generation models w.r.t. logical inference. The new task poses
challenges to the existing monotonic generation frameworks due to the mismatch
between sequence order and logical order. In our experiments, we
comprehensively survey different generation architectures (LSTM, Transformer,
Pre-Trained LM) trained with different algorithms (RL, Adversarial Training,
Coarse-to-Fine) on the dataset and made the following observations: 1) Pre-Trained
LMs can significantly boost both the fluency and logical fidelity metrics, 2) RL
and Adversarial Training trade fluency for fidelity, 3) Coarse-to-Fine
generation can help partially alleviate the fidelity issue while maintaining
high language fluency. The code and data are available at
https://github.com/wenhuchen/LogicNLG.
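To make the task concrete, the sketch below contrasts a surface-level statement (restating a single cell) with a logically entailed statement (requiring symbolic inference such as an argmax over a column), and shows a toy fidelity check in that spirit. The table, the statements, and the check_superlative helper are illustrative assumptions only; they are not taken from the paper or the LogicNLG repository.

```python
# Toy illustration of logical NLG over a semi-structured table (not the paper's code).

table = {
    "header": ["nation", "gold", "silver", "bronze"],
    "rows": [
        ["canada", 3, 1, 2],
        ["mexico", 2, 4, 1],
        ["chile",  1, 0, 3],
    ],
}

# A surface-level statement only restates a single cell:
surface_statement = "canada won 3 gold medals."

# A logically entailed statement requires inference over the whole table
# (here, an argmax over the 'gold' column):
logical_statement = "canada won the most gold medals."

def check_superlative(table, column, claimed_entity):
    """Toy fidelity check: verify an argmax-style claim against the table."""
    col = table["header"].index(column)
    best_row = max(table["rows"], key=lambda row: row[col])
    return best_row[0] == claimed_entity

print(check_superlative(table, "gold", "canada"))  # True: the claim is entailed by the table
```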
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers.
It is common to instead use proxy tasks that are similar in only an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
arXiv Detail & Related papers (2024-11-11T16:33:25Z) - Autoformalizing Natural Language to First-Order Logic: A Case Study in Logical Fallacy Detection [44.31755414036022]
We introduce Natural Language to First-Order Logic (NL2FOL), a framework to autoformalize natural language to FOL step by step using Large Language Models (LLMs).
Our approach addresses key challenges in this translation process, including the integration of implicit background knowledge.
Being neurosymbolic, our approach also provides interpretable insights into the reasoning process and demonstrates robustness without requiring model fine-tuning or labeled training data.
arXiv Detail & Related papers (2024-04-18T00:20:48Z) - In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z) - CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z) - State space models can express n-gram languages [51.823427608117626]
We build state space language models that can solve the next-word prediction task for languages generated from n-gram rules.
Our proof shows how SSMs can encode n-gram rules using new theoretical results on their capacity.
We conduct experiments with a small dataset generated from n-gram rules to show how our framework can be applied to SSMs and RNNs obtained through gradient-based optimization.
arXiv Detail & Related papers (2023-06-20T10:41:23Z) - MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text
Generation [102.20036684996248]
We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.
We conduct experiments on two data-to-text generation tasks, WebNLG and LogicNLG.
arXiv Detail & Related papers (2022-12-16T17:36:23Z) - PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation [44.78200830757109]
We propose a PLOG (Pretrained Logical Form Generator) framework to improve the generation fidelity.
PLOG is first pretrained on a table-to-logic-form generation task, then finetuned on downstream table-to-text tasks.
PLOG can learn logical inference from table-logic pairs much more definitely than from table-text pairs.
arXiv Detail & Related papers (2022-05-25T11:55:54Z) - Improving Logical-Level Natural Language Generation with
Topic-Conditioned Data Augmentation and Logical Form Generation [18.93964332724296]
We propose a topic-conditioned data augmentation (TopicDA) to generate logical forms and textual descriptions directly from tables.
We introduce logical form generation (LG), a dual task of Logic2text that requires generating a valid logical form based on a text description of a table.
We also propose a semi-supervised learning approach to jointly train a Logic2text and an LG model with both labeled and augmented data.
arXiv Detail & Related papers (2021-12-12T13:50:18Z) - NeuralLog: Natural Language Inference with Joint Neural and Logical
Reasoning [6.795509403707242]
We propose an inference framework called NeuralLog, which utilizes both a monotonicity-based logical inference engine and a neural network language model for phrase alignment.
Our framework models the NLI task as a classic search problem and uses the beam search algorithm to search for optimal inference paths.
Experiments show that our joint logic and neural inference system improves accuracy on the NLI task and can achieve state-of-the-art accuracy on the SICK and MED datasets.
arXiv Detail & Related papers (2021-05-29T01:02:40Z) - Logic2Text: High-Fidelity Natural Language Generation from Logical Forms [84.5687465831598]
We formulate logical-level NLG as generation from logical forms in order to obtain controllable, high-fidelity, and faithful generations.
We present a new large-scale dataset, Logic2Text, with 10,753 descriptions involving common logic types paired with the underlying logical forms.
arXiv Detail & Related papers (2020-04-30T04:06:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.