Smoothing Entailment Graphs with Language Models
- URL: http://arxiv.org/abs/2208.00318v2
- Date: Thu, 21 Sep 2023 19:27:13 GMT
- Title: Smoothing Entailment Graphs with Language Models
- Authors: Nick McKenna, Tianyi Li, Mark Johnson, Mark Steedman
- Abstract summary: We present a theory of optimal smoothing of Entailment Graphs built by Open Relation Extraction (ORE).
We demonstrate an efficient, open-domain, and unsupervised smoothing method using an off-the-shelf Language Model to find approximations of missing premise predicates.
In a QA task we show that EG smoothing is most useful for answering questions with less supporting text, where missing premise predicates are more costly.
- Score: 15.499215600170238
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The diversity and Zipfian frequency distribution of natural language
predicates in corpora lead to sparsity in Entailment Graphs (EGs) built by
Open Relation Extraction (ORE). EGs are computationally efficient and
explainable models of natural language inference, but as symbolic models, they
fail if a novel premise or hypothesis vertex is missing at test-time. We
present theory and methodology for overcoming such sparsity in symbolic models.
First, we introduce a theory of optimal smoothing of EGs by constructing
transitive chains. We then demonstrate an efficient, open-domain, and
unsupervised smoothing method using an off-the-shelf Language Model to find
approximations of missing premise predicates. This improves recall by 25.1 and
16.3 percentage points on two difficult directional entailment datasets, while
raising average precision and maintaining model explainability. Further, in a
QA task we show that EG smoothing is most useful for answering questions with
less supporting text, where missing premise predicates are more costly.
Finally, controlled experiments with WordNet confirm our theory and show that
hypothesis smoothing is difficult, but possible in principle.
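As a rough illustration of the smoothing step (a minimal sketch under assumed names, not the authors' implementation), the snippet below backs off from an unseen premise predicate to its most similar in-graph predicate using an off-the-shelf sentence encoder, then answers the entailment query symbolically. The toy graph, the `all-MiniLM-L6-v2` encoder, and the similarity threshold are illustrative choices.

```python
# Minimal sketch of LM-based EG smoothing: when the premise predicate is missing
# from the symbolic graph, back off to its nearest in-graph neighbor under an
# off-the-shelf sentence encoder. Graph, predicates, and threshold are toy values.
from sentence_transformers import SentenceTransformer, util

# Toy entailment graph: premise predicate -> set of entailed hypothesis predicates.
entailment_graph = {
    "person.X purchases thing.Y": {"person.X owns thing.Y", "person.X pays for thing.Y"},
    "person.X visits place.Y": {"person.X travels to place.Y"},
}

encoder = SentenceTransformer("all-MiniLM-L6-v2")
graph_preds = list(entailment_graph)
graph_embs = encoder.encode(graph_preds, convert_to_tensor=True)

def entails(premise_pred, hypothesis_pred, sim_threshold=0.6):
    """True if the (possibly unseen) premise predicate entails the hypothesis."""
    if premise_pred in entailment_graph:                  # exact symbolic lookup
        return hypothesis_pred in entailment_graph[premise_pred]
    # Smoothing: approximate the missing premise with the most similar in-graph one.
    query_emb = encoder.encode(premise_pred, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, graph_embs)[0]
    best = int(scores.argmax())
    if float(scores[best]) < sim_threshold:               # no close proxy found
        return False
    return hypothesis_pred in entailment_graph[graph_preds[best]]

# The unseen premise "person.X buys thing.Y" is smoothed to "person.X purchases thing.Y".
print(entails("person.X buys thing.Y", "person.X owns thing.Y"))
```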
Related papers
- Improving the Natural Language Inference robustness to hard dataset by data augmentation and preprocessing [1.7487745673871375]
Natural Language Inference (NLI) is the task of inferring whether the hypothesis can be justified by the given premise.
We propose data augmentation and preprocessing methods to address the word overlap, numerical reasoning, and length mismatch problems.
arXiv Detail & Related papers (2024-12-10T01:49:23Z) - Graph Stochastic Neural Process for Inductive Few-shot Knowledge Graph Completion [63.68647582680998]
We focus on a task called inductive few-shot knowledge graph completion (I-FKGC).
Inspired by the idea of inductive reasoning, we cast I-FKGC as an inductive reasoning problem.
We present a neural process-based hypothesis extractor that models the joint distribution of hypotheses, from which we can sample a hypothesis for prediction.
In the second module, based on the hypothesis, we propose a graph attention-based predictor to test if the triple in the query set aligns with the extracted hypothesis.
arXiv Detail & Related papers (2024-08-03T13:37:40Z) - EntailE: Introducing Textual Entailment in Commonsense Knowledge Graph
Completion [54.12709176438264]
Commonsense knowledge graphs (CSKGs) utilize free-form text to represent named entities, short phrases, and events as their nodes.
Current methods leverage semantic similarities to increase the graph density, but the semantic plausibility of the nodes and their relations is under-explored.
We propose to adopt textual entailment to find implicit entailment relations between CSKG nodes, to effectively densify the subgraph connecting nodes within the same conceptual class.
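A hedged sketch of that densification idea (an illustration, not the EntailE code): score textual entailment between node phrases with an off-the-shelf NLI model and add a directed edge whenever the entailment probability clears a threshold. The `roberta-large-mnli` checkpoint, node texts, and threshold are assumptions for the example.

```python
# Illustrative densification: connect CSKG nodes u -> v when u textually entails v,
# as judged by a pretrained NLI model (not the method's actual implementation).
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

nodes = ["person drinks coffee", "person consumes a beverage", "person rides a bike"]

def densify(nodes, threshold=0.9):
    edges = []
    for premise in nodes:
        for hypothesis in nodes:
            if premise == hypothesis:
                continue
            scores = nli({"text": premise, "text_pair": hypothesis}, top_k=None)
            if scores and isinstance(scores[0], list):    # some versions nest the output
                scores = scores[0]
            p_entail = next(s["score"] for s in scores if s["label"] == "ENTAILMENT")
            if p_entail >= threshold:
                edges.append((premise, hypothesis))
    return edges

print(densify(nodes))
```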
arXiv Detail & Related papers (2024-02-15T02:27:23Z) - LINC: A Neurosymbolic Approach for Logical Reasoning by Combining
Language Models with First-Order Logic Provers [60.009969929857704]
Logical reasoning is an important task for artificial intelligence with potential impacts on science, mathematics, and society.
In this work, we reformulate such tasks as modular neurosymbolic programming, which we call LINC.
We observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate.
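A rough sketch of that neurosymbolic split, with the language-model translation step mocked by hand-written formulas (the real pipeline would generate the first-order logic from text) and NLTK's resolution prover standing in for the external prover:

```python
# LINC-style pipeline sketch: (1) translate premises and conclusion to first-order
# logic (mocked here; LINC uses a language model), (2) hand them to a symbolic prover.
from nltk.sem import Expression
from nltk.inference import ResolutionProver

read_expr = Expression.fromstring

# "All birds can fly. Tweety is a bird."  |-  "Tweety can fly."
premises = [read_expr("all x.(bird(x) -> fly(x))"), read_expr("bird(tweety)")]
conclusion = read_expr("fly(tweety)")

proved = ResolutionProver().prove(conclusion, premises)
print("entailed" if proved else "not provable")
```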
arXiv Detail & Related papers (2023-10-23T17:58:40Z) - From the One, Judge of the Whole: Typed Entailment Graph Construction
with Predicate Generation [69.91691115264132]
Entailment Graphs (EGs) are constructed to indicate context-independent entailment relations in natural languages.
In this paper, we propose a multi-stage method, Typed Predicate-Entailment Graph Generator (TP-EGG), to tackle the sparsity of EG construction.
Experiments on benchmark datasets show that TP-EGG can generate high-quality and scale-controllable entailment graphs.
arXiv Detail & Related papers (2023-06-07T05:46:19Z) - Unveiling the Sampling Density in Non-Uniform Geometric Graphs [69.93864101024639]
We consider graphs as geometric graphs: nodes are randomly sampled from an underlying metric space, and any pair of nodes is connected if their distance is less than a specified neighborhood radius.
In a social network, communities can be modeled as densely sampled areas, and hubs as nodes with a larger neighborhood radius.
We develop methods to estimate the unknown sampling density in a self-supervised fashion.
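For intuition, a toy version of this geometric-graph view (not the paper's self-supervised estimator): nodes are drawn from the unit square, pairs within radius r are connected, and the local sampling density is crudely estimated from node degree. All constants are assumptions.

```python
# Toy random geometric graph: sample nodes, connect pairs within radius r, and
# estimate the sampling density at each node as degree / (n * area of the r-ball).
import numpy as np

rng = np.random.default_rng(0)
n, r = 500, 0.1
points = rng.uniform(size=(n, 2))                  # nodes sampled from the metric space

dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
adjacency = (dists < r) & ~np.eye(n, dtype=bool)   # connect pairs closer than r

degree = adjacency.sum(axis=1)
density_estimate = degree / (n * np.pi * r**2)     # crude local density estimate
print(density_estimate[:5])
```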
arXiv Detail & Related papers (2022-10-15T08:01:08Z) - Generative Text Modeling through Short Run Inference [47.73892773331617]
The present work proposes a short run dynamics for inference. It is initialized from the prior distribution of the latent variable and then runs a small number of Langevin dynamics steps guided by its posterior distribution.
We show that the models trained with short run dynamics more accurately model the data, compared to strong language model and VAE baselines, and exhibit no sign of posterior collapse.
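A generic sketch of short-run Langevin inference under a Gaussian observation model (the placeholder decoder, step count, and step size below are assumptions, not the authors' settings):

```python
# Short-run inference sketch: start z at the prior, then take a few Langevin steps
# along the gradient of log p(x|z) + log p(z). The decoder is a toy stand-in.
import torch

def short_run_inference(x, decoder, steps=20, step_size=0.1, z_dim=32):
    z = torch.randn(x.size(0), z_dim)                        # initialize from the prior
    for _ in range(steps):
        z = z.detach().requires_grad_(True)
        recon = decoder(z)
        log_joint = (-0.5 * ((x - recon) ** 2).sum()         # log p(x|z), unit-variance Gaussian
                     - 0.5 * (z ** 2).sum())                 # log p(z), standard normal
        grad, = torch.autograd.grad(log_joint, z)
        z = z + 0.5 * step_size**2 * grad + step_size * torch.randn_like(z)  # Langevin update
    return z.detach()

decoder = torch.nn.Linear(32, 64)                            # toy generator network
x = torch.randn(8, 64)
print(short_run_inference(x, decoder).shape)                 # torch.Size([8, 32])
```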
arXiv Detail & Related papers (2021-05-27T09:14:35Z) - Learning Graphs from Smooth Signals under Moment Uncertainty [23.868075779606425]
We consider the problem of inferring the graph structure from a given set of graph signals.
Traditional graph learning models do not take this distributional uncertainty into account.
arXiv Detail & Related papers (2021-05-12T06:47:34Z) - Query Training: Learning a Worse Model to Infer Better Marginals in
Undirected Graphical Models with Hidden Variables [11.985433487639403]
Probabilistic graphical models (PGMs) provide a compact representation of knowledge that can be queried in a flexible way.
We introduce query training (QT), a mechanism to learn a PGM that is optimized for the approximate inference algorithm that will be paired with it.
We demonstrate experimentally that QT can be used to learn a challenging 8-connected grid Markov random field with hidden variables.
arXiv Detail & Related papers (2020-06-11T20:34:32Z) - Generalized Entropy Regularization or: There's Nothing Special about
Label Smoothing [83.78668073898001]
We introduce a family of entropy regularizers, which includes label smoothing as a special case.
We find that variance in model performance can be explained largely by the resulting entropy of the model.
We advise the use of other entropy regularization methods in its place.
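To make the family concrete, a small sketch (an illustration, not the paper's exact parameterization) of two members: a confidence-penalty style entropy regularizer and standard label smoothing, written as drop-in loss functions.

```python
# Two entropy-regularized losses: cross-entropy minus a model-entropy bonus
# ("confidence penalty"), and standard label smoothing as the familiar special case.
import torch
import torch.nn.functional as F

def confidence_penalty_loss(logits, targets, beta=0.1):
    log_probs = F.log_softmax(logits, dim=-1)
    ce = F.nll_loss(log_probs, targets)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()   # H(model)
    return ce - beta * entropy                                    # reward higher entropy

def label_smoothing_loss(logits, targets, eps=0.1):
    n_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    smooth = torch.full_like(log_probs, eps / n_classes)          # eps mass spread uniformly
    smooth.scatter_(1, targets.unsqueeze(1), 1.0 - eps + eps / n_classes)
    return -(smooth * log_probs).sum(dim=-1).mean()

logits, targets = torch.randn(4, 10), torch.tensor([1, 3, 5, 7])
print(confidence_penalty_loss(logits, targets), label_smoothing_loss(logits, targets))
```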
arXiv Detail & Related papers (2020-05-02T12:46:28Z)