Interactive Model with Structural Loss for Language-based Abductive
Reasoning
- URL: http://arxiv.org/abs/2112.00284v1
- Date: Wed, 1 Dec 2021 05:21:07 GMT
- Title: Interactive Model with Structural Loss for Language-based Abductive
Reasoning
- Authors: Linhao Li, Ming Xu, Yongfeng Dong, Xin Li, Ao Wang, Qinghua Hu
- Abstract summary: The abductive natural language inference task ($\alpha$NLI) is proposed to infer the most plausible explanation between the cause and the event.
We name this new model for $\alpha$NLI: Interactive Model with Structural Loss (IMSL).
Our IMSL achieves the highest performance with the RoBERTa-large pretrained model, improving ACC and AUC by about 1% and 5%, respectively.
- Score: 36.02450824915494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The abductive natural language inference task ($\alpha$NLI) is proposed to
infer the most plausible explanation between the cause and the event. In the
$\alpha$NLI task, two observations are given, and the model is asked to pick the
most plausible hypothesis out of a set of candidates. Existing methods model the
relation between the observations and each candidate hypothesis separately and
penalize the inference network uniformly. In this paper, we argue that it is
unnecessary to distinguish the reasoning abilities among correct hypotheses;
similarly, all wrong hypotheses contribute equally to explaining the reasons for
the observations. We therefore propose to group the hypotheses rather than rank
them, and we design a structural loss called the ``joint softmax focal loss''.
Based on the observation that the candidate hypotheses are generally semantically
related, we design a novel interactive language model that exploits the rich
interactions among competing hypotheses. We name this new model for $\alpha$NLI:
Interactive Model with Structural Loss (IMSL). Experimental results show that
IMSL achieves the highest performance with the RoBERTa-large pretrained model,
improving ACC and AUC by about 1\% and 5\%, respectively.
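A minimal sketch of what the grouped ``joint softmax focal loss'' described in the abstract could look like, assuming the model emits one plausibility score per candidate hypothesis. Treating all correct hypotheses as a single softmax group, the focal exponent gamma, and every function and argument name here are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F


def joint_softmax_focal_loss(scores: torch.Tensor,
                             labels: torch.Tensor,
                             gamma: float = 2.0) -> torch.Tensor:
    """Hypothetical grouped focal loss for alpha-NLI-style candidate scoring.

    scores: (batch, K) plausibility scores for K candidate hypotheses.
    labels: (batch, K) binary mask with 1 for every correct hypothesis.

    All correct hypotheses are treated as one group: the loss maximizes the
    total softmax probability assigned to that group, so correct candidates
    are not ranked against each other; the focal term down-weights examples
    the model already handles well.
    """
    probs = F.softmax(scores, dim=-1)             # per-candidate probabilities
    p_correct = (probs * labels).sum(dim=-1)      # probability mass on the correct group
    p_correct = p_correct.clamp_min(1e-8)         # numerical safety
    focal_weight = (1.0 - p_correct) ** gamma     # focus training on hard examples
    return (-focal_weight * p_correct.log()).mean()


# Toy usage: three candidates, the first two are both acceptable explanations.
scores = torch.tensor([[2.1, 1.8, -0.5]])
labels = torch.tensor([[1.0, 1.0, 0.0]])
print(joint_softmax_focal_loss(scores, labels))
```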
Related papers
- How often are errors in natural language reasoning due to paraphrastic variability? [29.079188032623605]
We propose a metric for evaluating the paraphrastic consistency of natural language reasoning models.
We mathematically connect this metric to the proportion of a model's variance in correctness attributable to paraphrasing.
We collect ParaNLU, a dataset of 7,782 human-written and validated paraphrased reasoning problems.
arXiv Detail & Related papers (2024-04-17T20:11:32Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Neural Causal Models for Counterfactual Identification and Estimation [62.30444687707919]
We study the evaluation of counterfactual statements through neural models.
First, we show that neural causal models (NCMs) are expressive enough.
Second, we develop an algorithm for simultaneously identifying and estimating counterfactual distributions.
arXiv Detail & Related papers (2022-09-30T18:29:09Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- Beyond Distributional Hypothesis: Let Language Models Learn Meaning-Text Correspondence [45.9949173746044]
We show that large-size pre-trained language models (PLMs) do not satisfy the logical negation property (LNP).
We propose a novel intermediate training task, named meaning-matching, designed to directly learn a meaning-text correspondence.
We find that the task enables PLMs to learn lexical semantic information.
arXiv Detail & Related papers (2022-05-08T08:37:36Z)
- Exploring Lexical Irregularities in Hypothesis-Only Models of Natural Language Inference [5.283529004179579]
Natural Language Inference (NLI) or Recognizing Textual Entailment (RTE) is the task of predicting the entailment relation between a pair of sentences.
Models that understand entailment should encode both the premise and the hypothesis.
Experiments by Poliak et al. revealed a strong preference of these models towards patterns observed only in the hypothesis.
arXiv Detail & Related papers (2021-01-19T01:08:06Z)
- Modeling Voting for System Combination in Machine Translation [92.09572642019145]
We propose an approach to modeling voting for system combination in machine translation.
Our approach combines the advantages of statistical and neural methods since it can not only analyze the relations between hypotheses but also allow for end-to-end training.
arXiv Detail & Related papers (2020-07-14T09:59:38Z)
- L2R2: Leveraging Ranking for Abductive Reasoning [65.40375542988416]
The abductive natural language inference task ($\alpha$NLI) is proposed to evaluate the abductive reasoning ability of a learning system.
A novel $L2R2$ approach is proposed under the learning-to-rank framework.
Experiments on the ART dataset reach state-of-the-art performance on the public leaderboard.
arXiv Detail & Related papers (2020-05-22T15:01:23Z)
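For contrast with IMSL's grouping objective, here is a hedged sketch of the kind of pairwise learning-to-rank loss that L2R2-style methods build on: every plausible hypothesis is pushed above every implausible one by a margin. The margin value and all names are illustrative assumptions, not taken from the L2R2 paper.

```python
import torch


def pairwise_ranking_loss(scores: torch.Tensor,
                          labels: torch.Tensor,
                          margin: float = 1.0) -> torch.Tensor:
    """Hypothetical pairwise hinge loss over candidate hypotheses.

    scores: (K,) plausibility scores; labels: (K,) with 1 = plausible, 0 = not.
    """
    pos = scores[labels.bool()]         # scores of plausible hypotheses
    neg = scores[~labels.bool()]        # scores of implausible hypotheses
    # Hinge on every (plausible, implausible) pair of candidates.
    diffs = margin - (pos.unsqueeze(1) - neg.unsqueeze(0))
    return torch.clamp(diffs, min=0.0).mean()


# Toy usage with the same three candidates as in the IMSL sketch above.
print(pairwise_ranking_loss(torch.tensor([2.1, 1.8, -0.5]),
                            torch.tensor([1.0, 1.0, 0.0])))
```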